FAQ on optimized modes of lex and yacc


eliben

Nov 8, 2008, 3:03:15 AM
to ply-hack
Hello,

I want to conclude these discussions:
http://groups.google.com/group/ply-hack/browse_frm/thread/22685d1b7fb728fe/f09ce65cf3a04e3a?lnk=gst&q=lex+optimized#f09ce65cf3a04e3a
http://groups.google.com/group/ply-hack/browse_frm/thread/a799e10f9c200444/edd49473b4aca366?lnk=gst&q=lex+optimized#edd49473b4aca366

together with some insights I gathered while attempting to build a
PLY-based grammar for distribution, into some kind of FAQ.

If I understand correctly, this is the logic for the optimize modes of
yacc and lex:


yacc optimization:
- If parsetab.py/pyc doesn't exist in the path, the table will be
  reconstructed anyway, regardless of the optimize parameter
- If it doesn't exist:
  - If optimize=True, the table will be loaded unconditionally
  - If optimize=False, the table will be loaded only if it's older
    than the grammar

lex optimization:
- If optimize=False, the lexical table is re-computed and is not saved
  to a lextab file
- If optimize=True:
  - If lextab.py/pyc exists in the path, it will be loaded
    unconditionally
  - If lextab.py/pyc doesn't exist, it will be created and loaded

Please correct me if I'm wrong. Thanks !

Eli

eliben

Nov 8, 2008, 3:05:38 AM
to ply-hack
[snip]
> yacc optimization:
> - If parsetab.py/pyc doesn't exist in the path, the table will be
> reconstructed anyway, regardless of the optimize parameter
> - If it doesn't exist:
[snip]

Sorry, this is a typo, should be "If it does exist" in the second IF.
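
With that correction applied, the summary can be modeled roughly as
below. This is only a sketch of the logic as summarized in this thread,
not PLY's own code: the function names and the returned strings are
made up for illustration, and the optimize=False check for yacc is
reduced to a single boolean (table_matches_grammar), which is an
assumption about how the table is compared against the grammar.

```python
def yacc_table_action(table_exists, optimize, table_matches_grammar):
    """Model of the yacc summary; returns 'rebuild' or 'load' (hypothetical names)."""
    if not table_exists:
        # No parsetab.py/pyc on the path: rebuilt regardless of optimize.
        return "rebuild"
    if optimize:
        # Optimized mode: the existing table is trusted unconditionally.
        return "load"
    # Non-optimized mode: the table is used only if it still fits the grammar.
    return "load" if table_matches_grammar else "rebuild"


def lex_table_action(table_exists, optimize):
    """Model of the lex summary; returns 'recompute', 'load' or 'create_and_load'."""
    if not optimize:
        # Non-optimized mode: recomputed every time, no lextab file is saved.
        return "recompute"
    return "load" if table_exists else "create_and_load"
```

Note that with optimize=True neither function looks at the grammar at
all once a table file exists.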

Eli

eliben

Nov 14, 2008, 1:50:02 AM
to ply-hack
On Nov 8, 10:03 am, eliben <eli...@gmail.com> wrote:
> Hello,
>
> I want to conclude these discussions:
> http://groups.google.com/group/ply-hack/browse_frm/thread/22685d1b7fb...
> http://groups.google.com/group/ply-hack/browse_frm/thread/a799e10f9c2...
>
> With some insights gathered while attempting to build a PLY based
> grammar for distribution into some kind of a FAQ.
>
> If I understand correctly, this is the logic for the optimize modes of
> yacc and lex:
>
> yacc optimization:
> - If parsetab.py/pyc doesn't exist in the path, the table will be
> reconstructed anyway, regardless of the optimize parameter
> - If it doesn't exist:
>   - If optimize=True, the table will be loaded unconditionally
>   - If optimize=False, the table will be loaded only if it's older
> than the grammar
>
> lex optimization:
> - If optimize=False, the lexical table is re-computed and is not saved
> to a lextab file
> - If optimize=True:
>   - If lextab.py/pyc exists in the path, it will be loaded
> unconditionally
>   - If lextab.py/pyc doesn't exist, it will be created and loaded
>

No comments ?

I find this issue quite important for the distribution of effective
PLY parsers as self-contained modules that provide maximal performance
and don't trouble the user. It's hard to believe that no one has used
these features and explored the way they work (as the documentation is
very incomplete on this matter).

D.Hendriks (Dennis)

Nov 14, 2008, 8:10:06 AM
to ply-...@googlegroups.com
Hello,

> It's hard to believe that no one has used
> these features and explored the way they work

I looked at that (and more) some time ago. For our tooling
I created a wrapper around Ply. The features of this wrapper are:
 - Class oriented PlyScanner and PlyParser classes.

 - Different modes (normal/debug/opt), which set different
   settings of Ply.

 - Table file creation is always done in the same directory
   as the source files.

 - Table files can be created for multiple classes in a single
   file, with different start symbols.

 - Automatic parse/scan error exceptions with line numbers
   (without enabling the (slow) option to let ply track line
   numbers).

 - White-space is automatically skipped in the scanners (can
   be modified by derived classes).

 - Functions to generate/remove the table files.

   - We always install the tooling we create on a central
     machine to which all students have read access. However,
     the students don't have write access and therefore can't
     generate the table files. We have *_gentables.py files
     that are used by the setup program to generate the table
     files.

 - Derived classes should only need to specify:

   - list of tokens (scanner)
   - t_* (scanner)
   - p_* (parser)
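
To give an idea of the "remove the table files" part, here is a minimal
sketch. It is not the code from plywrapper.py: the function name is
hypothetical, and it assumes the default Ply table module names
(lextab and parsetab) rather than per-class names.

```python
import os

# Default Ply table module names; adjust these if custom names are used.
TABLE_BASENAMES = ("lextab", "parsetab")


def remove_table_files(directory, basenames=TABLE_BASENAMES):
    """Delete generated table files (.py and .pyc) and return the removed paths."""
    removed = []
    for base in basenames:
        for ext in (".py", ".pyc"):
            path = os.path.join(directory, base + ext)
            if os.path.isfile(path):
                os.remove(path)
                removed.append(path)
    return removed
```

A setup program with write access could call something like this
before regenerating the tables, so that stale tables are never picked
up by users who can only read the installation directory.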

The plywrapper.py file is standalone and only needs Ply, except
for the exception base class from one other file in our tooling.
However, one could easily change it to inherit from StandardError
instead...

For the up-to-date code, see:
http://dev.se.wtb.tue.nl/projects/chi-tooling/browser/trunk/chinetics/chinetics/core/plywrapper.py
(for people reading this post a long time after I posted it: note
 that I can't guarantee that the link will work forever)

The plywrapper.py file contains some use/implementation instructions
and other useful information.

For some parsers based on this:
http://dev.se.wtb.tue.nl/projects/chi-tooling/browser/trunk/chinetics/chinetics/languages/cif/compiler/cifparser.py
http://dev.se.wtb.tue.nl/projects/chi-tooling/browser/trunk/chinetics/chinetics/languages/cif/compiler/cifparser_gentables.py
http://dev.se.wtb.tue.nl/projects/chi-tooling/browser/trunk/chinetics/chinetics/languages/common/compiler/exprparser.py
http://dev.se.wtb.tue.nl/projects/chi-tooling/browser/trunk/chinetics/chinetics/languages/common/compiler/exprparser_gentables.py

If you have any questions about plywrapper, feel free to ask. Also,
suggestions for improvements are always welcome.

Dennis


David Beazley

Nov 14, 2008, 8:29:23 AM
to ply-hack, eliben

I thought I might have commented on this earlier, but maybe not (things have been rather crazy around here). On quick glance, the summary presented here seems
correct. I think the main thing to keep in mind is that if you're using PLY's optimized modes, all bets are off for any kind of error checking or validation. PLY will
simply assume that any sort of pre-existing table file (lex or yacc) is correct and start using it. To keep these tables tied to your application, it would make sense to
give them application-specific names. However, that's about it.
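
As a rough sketch of what application-specific names could look like: the helper below only builds the keyword arguments, using the lextab argument of lex.lex() and the tabmodule argument of yacc.yacc(). The helper name and the "myapp" prefix are made up for illustration, and the naming scheme is just one possibility.

```python
def ply_table_kwargs(app_name, optimize=True):
    """Build keyword arguments for lex.lex() and yacc.yacc() (hypothetical helper).

    The table module names are derived from app_name so that tables from
    different applications on the same path do not get mixed up.
    """
    lex_kwargs = {"optimize": int(optimize), "lextab": app_name + "_lextab"}
    yacc_kwargs = {"optimize": int(optimize), "tabmodule": app_name + "_parsetab"}
    return lex_kwargs, yacc_kwargs


# Intended use (needs PLY and a module that defines the tokens and rules):
#   import ply.lex as lex
#   import ply.yacc as yacc
#   lex_kw, yacc_kw = ply_table_kwargs("myapp")
#   lexer = lex.lex(**lex_kw)
#   parser = yacc.yacc(**yacc_kw)
```

Passing optimize=False during development keeps the usual error checking; the names stay the same either way.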

I'll try to add a section in the docs about distributing PLY software in the next release.

Cheers,
Dave




eliben

Nov 14, 2008, 9:35:57 AM
to ply-hack

On Nov 14, 3:29 pm, David Beazley <d...@dabeaz.com> wrote:
> I thought I might have commented on this earlier, but maybe not (things have been rather crazy around here).  On quick glance, the summary presented here seems
> correct.     I think the main thing to keep in mind is that if you're using PLY's optimized modes, all bets are off for any kind of error checking or validation.   PLY will
> simply assume that any sort of pre-existing table file (lex or yacc) is correct and start using it.    

This is exactly the kind of behavior I would expect from a PLY-based
parser for 99.99% of its service, deep inside some program, parsing
stuff as quickly as possible. The only occasion on which I can turn
optimization off and would want to enjoy the error checking is during
the development of the parser.

Thanks for answering

Eli

eliben

Nov 14, 2008, 9:38:26 AM
to ply-hack
<snip>

Dennis,

This is great, thanks a lot for sharing the code for PlyWrapper. I
will certainly have a look and see what I can gather from it for my
own parser development with PLY.

Eli