problems with test2.py

Werner LEMBERG

Jun 18, 2005, 2:33:14 PM

to py...@googlegroups.com

[pyggy 0.4.1, unmodified]
[python 2.3.3]

I wonder how test2.py shall work. According to the grammar and lexer
code I expect to see the message

skipping unknown letter

if I press, say, `a <Enter>'.

Instead, I get nothing. Even worse, if I then press `b <Enter>', I
get the following backtrace:

Traceback (most recent call last):
  File "./test2.py", line 27, in ?
    tree = p.parse()
  File "/usr/lib/python2.3/site-packages/pyggy/glr.py", line 200, in parse
    raise ParseError(self.tokval, self.errtoken)
pyggy.errors.ParseError: ParseError <type 'str'>

Somehow I have the feeling that the examples don't reflect how pyggy
works currently. Have you changed the syntax, without updating the
examples?

Reason for asking is that I'm not able to get a working engine. pyggy
accepts both my lexing and parsing code without warning, but whatever
I use as input, I get the same error as above. On the other hand,
pyggy itself works fine (but is too complicated for a python novice
like me), so I'm really clueless.

Please help.

Werner

Werner LEMBERG

Jun 18, 2005, 2:38:03 PM

to py...@googlegroups.com

> [pyggy 0.4.1, unmodified]
> [python 2.3.3]
>
> I wonder how test2.py shall work.

I forgot to mention that I've applied the change below to make it work
at all.

Werner

======================================================================

--- test2.pyl 2003-08-13 00:30:33.000000000 +0200
+++ test2.pyl.new 2005-06-14 22:38:33.000000000 +0200
@@ -5,5 +5,5 @@
"\*" : return "TIMES"
"[[:alpha:]][[:alnum:]]*" : return "ID"
"\n" : return # ignore newlines
- "." : print "skipping unknown letter", self.tokdata[0]
+ "." : print "skipping unknown letter", self.value

Tim Newsham

Jun 18, 2005, 10:01:39 PM

to py...@googlegroups.com

> I wonder how test2.py shall work. According to the grammar and lexer
> code I expect to see the message
>
> skipping unknown letter
>
> if I press, say, `a <Enter>'.
>
> Instead, I get nothing. Even worse, if I then press `b <Enter>', I
> get the following backtrace:
>
> Traceback (most recent call last):
> File "./test2.py", line 27, in ?
> tree = p.parse()
> File "/usr/lib/python2.3/site-packages/pyggy/glr.py", line 200, in parse
> raise ParseError(self.tokval, self.errtoken)
> pyggy.errors.ParseError: ParseError <type 'str'>
>
> Somehow I have the feeling that the examples don't reflect how pyggy
> works currently. Have you changed the syntax, without updating the
> examples?

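When you type "a <enter>", the parser is still waiting for more input
before finishing its parse; "a" matches the ID rule, so the "skipping
unknown letter" rule never fires for it. When you then follow with
"b <enter>", the input is two IDs in a row, which is invalid syntax, and
the parser complains. It is important to realize that this is not a
line-by-line lexer.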

It behaves properly on the following input:

echo 'id+id+id' | test2.py

You are correct in your next email that there is a bug when
it receives an unexpected character. After changing the tokdata
reference to value as you point out it behaves properly:

$ echo 'id+id+id!' | test2.py
skipping unknown letter !
parse done: [(id + (id + id)) or ((id + id) + id)]

> Reason for asking is that I'm not able to get a working engine. pyggy
> accepts both my lexing and parsing code without warning, but whatever
> I use as input, I get the same error as above. On the other hand,
> pyggy itself works fine (but is too complicated for a python novice
> like me), so I'm really clueless.

If you want a line-at-a-time parse, you can fetch a line of
input with a normal python command, and then give that input
to a lexer with the setinputstr method. See the example in
ply_calc.py for more details.
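A rough sketch of that line-at-a-time loop, written for current Python 3 rather than 2.3. Only the method name setinputstr comes from this thread; StubLexer, parse_lines and parse_one are hypothetical stand-ins, not PyGgy's API, so check ply_calc.py for the real calls.

```python
import io


class StubLexer:
    """Hypothetical stand-in for a lexer; only the name setinputstr is from the thread."""

    def __init__(self):
        self.text = ""

    def setinputstr(self, text):
        # Assumed behaviour: replace the lexer's pending input with one string.
        self.text = text


def parse_lines(stream, lexer, parse_one):
    """Hand each non-empty line of `stream` to the lexer and parse it on its own."""
    results = []
    for line in stream:
        if not line.strip():
            continue  # nothing to parse on a blank line
        lexer.setinputstr(line)
        results.append(parse_one(lexer))
    return results


if __name__ == "__main__":
    demo = io.StringIO("id+id\n\nid*id\n")
    # parse_one is a placeholder; a real program would run the parser here.
    print(parse_lines(demo, StubLexer(), lambda lx: lx.text.strip()))
```

Each line then succeeds or fails by itself, instead of the parser waiting for end of input.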

Debugging parsers isn't very easy in pyggy right now. The best way
to debug a parser is to look at the debug output of the parser
generator and manually play the parser's part, following where
you would be after each input token. Before doing this you should
probably test your lexer to make sure it is giving you the tokens
you expect.
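One way to check the token stream without PyGgy at all is a throwaway tokenizer built on Python's re module. The sketch below (Python 3) models its rules on the test2.pyl lines quoted in this thread; the PLUS rule is an assumption, since that part of the file is not shown here, and the function name tokens is made up for illustration.

```python
import re

# Rules modelled on the quoted test2.pyl lines; PLUS is assumed, not quoted.
TOKEN_RE = re.compile(r"""
    (?P<PLUS>\+)
  | (?P<TIMES>\*)
  | (?P<ID>[A-Za-z][A-Za-z0-9]*)
  | (?P<NEWLINE>\n)
  | (?P<UNKNOWN>.)
""", re.VERBOSE)


def tokens(text):
    """Return (token_name, value) pairs; newlines are dropped, unknown characters reported."""
    out = []
    for m in TOKEN_RE.finditer(text):
        kind = m.lastgroup
        if kind == "NEWLINE":
            continue  # ignore newlines, as the lexer spec does
        if kind == "UNKNOWN":
            print("skipping unknown letter", m.group())
            continue
        out.append((kind, m.group()))
    return out


if __name__ == "__main__":
    for tok in tokens("a\nb\n"):
        print(tok)
```

Run on "a", newline, "b", this yields two ID tokens and no "skipping unknown letter" message, which is the token sequence the parser then has to deal with.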

If you want some help feel free to post examples of what you are
working on here.

> Werner

Tim Newsham
http://www.lava.net/~newsham/

Tim Newsham

Jun 18, 2005, 10:03:30 PM

to py...@googlegroups.com

> --- test2.pyl 2003-08-13 00:30:33.000000000 +0200
> +++ test2.pyl.new 2005-06-14 22:38:33.000000000 +0200
> @@ -5,5 +5,5 @@
> "\*" : return "TIMES"
> "[[:alpha:]][[:alnum:]]*" : return "ID"
> "\n" : return # ignore newlines
> - "." : print "skipping unknown letter", self.tokdata[0]
> + "." : print "skipping unknown letter", self.value

test2.pyl and test3.pyl seem to not have been updated properly.
Thank you for pointing this out.

Tim Newsham
http://www.lava.net/~newsham/

Werner LEMBERG

Jun 19, 2005, 5:05:46 AM

to py...@googlegroups.com, new...@lava.net

> It behaves properly on the following input:

OK, thanks.

> echo 'id+id+id' | test2.py

Actually, I've always worked with input files, but it still fails.

> If you want some help feel free to post examples of what you are
> working on here.

OK, here it comes. The attached bundle contains an almost complete
lexer for MoinMoinWiki (which seems to run fine) and some first steps
for the parser, together with sample input files and little scripts to
run the lexer and parser. Whatever I input to the parser, I get

Traceback (most recent call last):
  File "./HeyHeyWikiTestParser.py", line 12, in ?
    tree = p.parse()
  File "/usr/lib/python2.3/site-packages/pyggy/glr.py", line 200, in parse
    raise ParseError(self.tokval, self.errtoken)
pyggy.errors.ParseError: ParseError <type 'str'>

Please help.

Werner

HeyHeyWiki.tar.bz2

Tim Newsham

Jun 19, 2005, 4:58:10 PM

to py...@googlegroups.com

>> If you want some help feel free to post examples of what you are
>> working on here.
>
> OK, here it comes. The attached bundle contains an almost complete
> lexer for MoinMoinWiki (which seems to run fine) and some first steps
> for the parser, together with sample input files and little scripts to
> run the lexer and parser. Whatever I input to the parser, I get
>
> Traceback (most recent call last):
> File "./HeyHeyWikiTestParser.py", line 12, in ?
> tree = p.parse()
> File "/usr/lib/python2.3/site-packages/pyggy/glr.py", line 200, in parse
> raise ParseError(self.tokval, self.errtoken)
> pyggy.errors.ParseError: ParseError <type 'str'>

The error message has a slight bug that has since been fixed
(I guess it isn't in the version on the web page yet). The
ParseError class in errors.py should print out self.str and
not str, which is why you see <type 'str'> instead of the token.
Your error should be:

pyggy.errors.ParseError: ParseError <<EOF>>
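To illustrate the kind of mistake involved, here is a small Python 3 sketch. It is a hypothetical stand-in and not the actual errors.py from PyGgy; only the attribute name str is taken from the description above, and the constructor arguments and method names are invented for the example.

```python
class ParseError(Exception):
    """Hypothetical stand-in, not PyGgy's real ParseError."""

    def __init__(self, value, token_name):
        Exception.__init__(self)
        self.value = value
        self.str = token_name  # attribute name taken from the description above

    def buggy_message(self):
        # Bug: bare `str` is the builtin type, not the attribute.
        return "ParseError %s" % (str,)

    def fixed_message(self):
        # Fix: use the attribute stored on the instance.
        return "ParseError %s" % (self.str,)


err = ParseError(None, "<<EOF>>")
print(err.buggy_message())  # names the builtin type: <class 'str'> on Python 3
print(err.fixed_message())
```

Python 2 formatted the builtin type as <type 'str'>, which matches the text in the traceback.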

Your input file lexes as:
TOK_CHAR_LOWER a
TOK_CHAR_LOWER b
TOK_CHAR_LOWER c
TOK_NEWLINE 1
TOK_NEWLINE 2

The "abc" gets parsed as a "word". Adding the first newline then gives
you a line. Next the parser wants a parsep to make a paragraph, but
only one newline is left rather than the two a parsep needs, so when
you get to the EOF token there is a parse error. If you add an extra
newline to the file it parses properly. You probably want a single
newline to reduce to a parsep.
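The walk-through above can be replayed by hand with a tiny recognizer. This Python 3 sketch assumes a grammar reconstructed only from the description here (paragraph = line parsep, line = word NEWLINE, word = one or more character tokens); the real rules in the attached bundle may differ, and the function and parameter names are made up.

```python
def parses_as_paragraph(tokens, parsep_newlines=2):
    """Recognise word, then one newline (a line), then a parsep of N newlines, then EOF.

    The grammar is an assumption reconstructed from the thread, not the real .pyg file.
    """
    pos = 0
    # word: one or more character tokens
    while pos < len(tokens) and tokens[pos] == "TOK_CHAR_LOWER":
        pos += 1
    if pos == 0:
        return False
    # line: the word followed by exactly one newline
    if pos >= len(tokens) or tokens[pos] != "TOK_NEWLINE":
        return False
    pos += 1
    # parsep: the required number of further newlines
    if tokens[pos:pos + parsep_newlines] != ["TOK_NEWLINE"] * parsep_newlines:
        return False
    pos += parsep_newlines
    return pos == len(tokens)  # must be at EOF


abc = ["TOK_CHAR_LOWER"] * 3
print(parses_as_paragraph(abc + ["TOK_NEWLINE"] * 2))     # False: parsep is one newline short
print(parses_as_paragraph(abc + ["TOK_NEWLINE"] * 3))     # True: extra newline added
print(parses_as_paragraph(abc + ["TOK_NEWLINE"] * 2, 1))  # True: single-newline parsep
```

The last call corresponds to letting a single newline reduce to a parsep, so the original file is accepted unchanged.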

By the way, your parser is doing part of the job that is traditionally
assigned to the lexer -- it is building words out of characters. You
can do this in the parser, but traditionally it is done in the lexer
for performance and to avoid some other problems. Your approach is
legitimate but non-traditional (you're half-way to a "scannerless
parser"). You will have to beware of ambiguities that can arise.
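For comparison, this is roughly what the two divisions of labour look like, as a plain Python 3 sketch using the re module rather than a PyGgy lexer spec. TOK_WORD is a hypothetical token name, and the input is assumed to contain only lowercase letters and newlines.

```python
import re


def char_tokens(text):
    """One token per character, similar to the lexer output listed above."""
    return [("TOK_NEWLINE" if c == "\n" else "TOK_CHAR_LOWER", c) for c in text]


def word_tokens(text):
    """Group each run of lowercase letters into one (hypothetical) TOK_WORD token."""
    out = []
    for m in re.finditer(r"[a-z]+|\n", text):
        if m.group() == "\n":
            out.append(("TOK_NEWLINE", "\n"))
        else:
            out.append(("TOK_WORD", m.group()))
    return out


print(len(char_tokens("abc\n")))  # 4 tokens for the parser to assemble
print(word_tokens("abc\n"))       # the word arrives as a single token
```

With the second form the grammar no longer needs a rule that builds words from characters, which removes one source of ambiguity.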

Werner LEMBERG

Jun 20, 2005, 2:30:46 AM

to py...@googlegroups.com, new...@lava.net

> The error message has a slight bug that has since been fixed
> (I guess it isn't in the version on the web page yet). [...]

I have a great talent finding bugs as soon as I touch some
software... I even caught one in Metafont. :-)

> The "abc" gets parsed as a "word". Then adding the first newline
> you get a line. Then the parser wants to get a parsep to make a
> paragraph, but you only have one newline and not two newlines, so
> when you get to the EOF token there is a parse error. If you add an
> extra newline to the file it parses properly. You probably want a
> single newline to reduce to a parsep.

Aah, yes. Thanks!

Werner
