PyLly RE parsing problem


J.P. Larocque

Sep 17, 2005, 8:53:58 AM
to Tim Newsham, py...@googlegroups.com
Hello,

I seem to have happened across a bug. PyLly bombs on this:
INITIAL:
"(b+)*": return

A more practical example that also fails:
INITIAL:
"(ab+)*": return

(This is a simplified case; the expression in my actual .ply file that
bombed was "{atext}+(\.{atext}+)*". Replacing it with
"{atext}+(\.{atext}{atext}*)*" works.)
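The workaround relies on the identity x+ = xx*. A quick sanity check of that identity, using Python's own re module rather than PyLly, might look like the sketch below. The ATEXT character class is only a stand-in assumption, since the real {atext} definition is not shown in this post, and the helper name agree is made up for illustration.

```python
import re
from itertools import product

# Assumption: a small stand-in for the {atext} definition, which is not shown here.
ATEXT = "[a-z]"

# The expression that made PyLly fail, and the rewritten one that works.
original = re.compile(rf"{ATEXT}+(\.{ATEXT}+)*")
workaround = re.compile(rf"{ATEXT}+(\.{ATEXT}{ATEXT}*)*")


def agree(max_len=5, alphabet="ab."):
    """Return True if both patterns accept exactly the same strings up to max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if bool(original.fullmatch(s)) != bool(workaround.fullmatch(s)):
                return False
    return True
```

Calling agree() compares the two patterns on every string over a small alphabet, so the rewrite should change nothing about what the lexer accepts; it only avoids the nested closure that triggers the assertion.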

Note that this does not fail (save "acceptance of the empty string
from starstate" warning):

INITIAL:
"(ab+c)*": return

Traceback for first example follows:

---8<---8<---
$ python -c 'import pyggy; pyggy.getlexer ("break.pyl")'
generating break_lextab.py from break.pyl
Traceback (most recent call last):
  File "<string>", line 1, in ?
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/helpers.py", line 46, in getlexer
    generate(specfname, tab, debug=debug, forcegen=forcegen)
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/helpers.py", line 29, in generate
    pylly.parsespec(fname, targ, debug=debug)
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/pylly.py", line 126, in parsespec
    helpers.proctree(tree, gt)
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/helpers.py", line 108, in proctree
    return p.proctree(t)
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/helpers.py", line 81, in proctree
    return self.proctree(t.possibilities[0])
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/helpers.py", line 88, in proctree
    kids = map(self.proctree, t.elements)
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/helpers.py", line 81, in proctree
    return self.proctree(t.possibilities[0])
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/helpers.py", line 88, in proctree
    kids = map(self.proctree, t.elements)
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/helpers.py", line 81, in proctree
    return self.proctree(t.possibilities[0])
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/helpers.py", line 88, in proctree
    kids = map(self.proctree, t.elements)
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/helpers.py", line 81, in proctree
    return self.proctree(t.possibilities[0])
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/helpers.py", line 88, in proctree
    kids = map(self.proctree, t.elements)
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/helpers.py", line 81, in proctree
    return self.proctree(t.possibilities[0])
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/helpers.py", line 90, in proctree
    return self.gram.semactions[prodno](kids)
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/pylly_gramtab.py", line 102, in action15
    return n.starclosmach(mach), str+"*"
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/nfa.py", line 192, in starclosmach
    return self.optmach(mach)
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/nfa.py", line 185, in optmach
    self.addtran(mach.end, end)
  File "/usr/local/stow/pyggy-0.4.1/lib/python2.3/site-packages/pyggy/nfa.py", line 132, in addtran
    assert next2 == NOSTATE and (trset == EPSILON or next1 == NOSTATE)
AssertionError
--->8--->8---

--
J.P. Larocque is <pir...@thoughtcrime.us> and <pir...@ely.ath.cx>
Encrypted/signed e-mail preferred; http://ely.ath.cx/~piranha/pgp
Fpr 5612 10A8 4986 2D85 A995 252B 4C02 5E02 F61D 2E61; ID 0xF61D2E61

Tim Newsham

Sep 17, 2005, 2:58:30 PM
to py...@googlegroups.com, piranha...@thoughtcrime.us
On Sat, 17 Sep 2005, J.P. Larocque wrote:
> I seem to have happened across a bug. PyLly bombs on this:
> INITIAL:
> "(b+)*": return
>
> A more practical example that also fails:
> INITIAL:
> "(ab+)*": return

Yes, this came up recently and I have a fix:

date: 2005/06/14 01:16:38; author: newsham; state: Exp; lines: +61 -47
- fixed a bug in nfa.py
- The last state of a machine must always maintain at least one
free outgoing slot! This was violated in posclos because we'd
sloppily reuse an empty slot of an existing machine. This made
regexp like (a+)+ fail on an assertion (thank god).

The attached patch has this fix. It changes the way positive
closure is handled; I had originally made a bad assumption.
The patch is slightly obfuscated by changes to move comments
into doc strings, but the relevant bits are the changes in
the posclosmach method.
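To make the invariant concrete, here is a toy model of it. This is not the code in nfa.py or in the patch; it is a minimal sketch that assumes each state has room for at most two outgoing transitions, and every name in it (Nfa, plus_reusing_end, plus_fresh_end, build_b_plus_star) is made up for illustration. It shows how a positive closure that borrows a slot from the machine's end state leaves a later star closure with nowhere to attach, while allocating a fresh end state keeps a slot free.

```python
MAX_SLOTS = 2  # assumption: two outgoing transitions per state, as in a Thompson-style NFA


class Nfa:
    """Toy NFA builder; a machine is a (start, end) pair of state numbers."""

    def __init__(self):
        self.trans = []  # trans[s] is a list of (label, target); label None means epsilon

    def state(self):
        self.trans.append([])
        return len(self.trans) - 1

    def add(self, src, label, dst):
        # Plays the role of the failing assertion: the source state needs a free slot.
        if len(self.trans[src]) >= MAX_SLOTS:
            raise AssertionError("state %d has no free outgoing slot" % src)
        self.trans[src].append((label, dst))

    def char(self, c):
        s, e = self.state(), self.state()
        self.add(s, c, e)
        return s, e

    def plus_reusing_end(self, m):
        # Sloppy version: loop back from the existing end state and keep it as the end.
        s, e = m
        self.add(e, None, s)
        return s, e

    def plus_fresh_end(self, m):
        # Careful version: the old end loops back and also feeds a new, empty end state.
        s, e = m
        new_end = self.state()
        self.add(e, None, s)
        self.add(e, None, new_end)
        return s, new_end

    def opt(self, m):
        # Optional machine: new start can skip straight to the new end.
        s, e = m
        new_start, new_end = self.state(), self.state()
        self.add(new_start, None, s)
        self.add(new_start, None, new_end)
        self.add(e, None, new_end)  # needs a free slot on the old end state
        return new_start, new_end

    def star(self, m, plus):
        # Star closure expressed as optional positive closure.
        return self.opt(plus(m))


def build_b_plus_star(plus_name):
    """Build (b+)* using the named positive-closure method."""
    n = Nfa()
    plus = getattr(n, plus_name)
    return n.star(plus(n.char("b")), plus)
```

In this model, build_b_plus_star("plus_reusing_end") raises AssertionError inside opt, much as the traceback above ends in optmach, because the end state of b+ has already spent both of its slots on loops. build_b_plus_star("plus_fresh_end") completes, since every machine it returns ends in a state with no outgoing transitions.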

I apologize that there hasn't been a new release with the rolled
up bug fixes in a while. I've been very busy and pyggy's maintenance
has suffered for it. I will try to get the CVS repository online
at least so people can access it.

> J.P. Larocque is <pir...@thoughtcrime.us> and <pir...@ely.ath.cx>

Tim Newsham
http://www.lava.net/~newsham/
nfa-patch.txt