Q: What we see is that the parse table file is imported and then, if 'optimize'
is enabled, it doesn't even matter if the digest matches or not... I think this
is a bug in the implementation? I think the 'or' in line 2615 should be an 'and'?
A:
The whole point of yacc optimized mode is to greatly reduce startup time and
improve parsing performance. In this mode, yacc doesn't even compute the signature or
attempt to validate the table file at all. It simply loads it and runs with it.
Therefore, this is not a bug, but intended behavior. It was always imagined that
one would fully debug their grammar and get things working in normal mode. Then,
at the very end, you could enable optimize mode.
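To make the effect of the 'or' concrete, here is a minimal sketch of the decision being discussed. It is not the actual yacc source; the function name tables_usable and its parameters are hypothetical, chosen only to mirror the condition described in the question.

```python
def tables_usable(optimize, stored_signature, current_signature):
    """Hypothetical stand-in for the check around the table import."""
    # With 'or', optimized mode short-circuits and the signatures are
    # never compared; with 'and', optimized mode would still validate.
    return bool(optimize) or stored_signature == current_signature


# Normal mode: a stale table file is rejected (and would be regenerated).
print(tables_usable(False, "old-digest", "new-digest"))  # False
# Optimized mode: the same stale file is accepted without validation.
print(tables_usable(True, "old-digest", "new-digest"))   # True
```

This is why the grammar should be fully debugged in normal mode first: once optimize is on, a stale table file is loaded as-is.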
Q: Why is the import executed, even if we don't use optimized mode?
A:
yacc always needs to load its parsing tables from a module, regardless of the mode.
Q: The signature is updated with __tabversion__, method, the start symbol,
precedence rules and the doc-strings of grammar rules. I'm missing the token list
and the names of the grammar rules. Why is this? (There may be other things missing?)
A:
The signature only includes things that affect the underlying parsing tables.
Doc strings already have all of the grammar rules, token names, and rule names.
The start symbol and precedence rules are the only other things that would
change the underlying table.
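As a rough illustration of the idea (not the real implementation), the signature can be thought of as a digest over exactly those inputs. The sketch below assumes an MD5 digest; the helper name table_signature, its parameters, and the "3.2" table version are hypothetical placeholders.

```python
import hashlib


def table_signature(tabversion, method, start, precedence, docstrings):
    """Hypothetical helper: digest the inputs that shape the parsing tables."""
    sig = hashlib.md5()
    # Table version, LR method, start symbol and precedence rules.
    for part in (tabversion, method, start, repr(precedence)):
        sig.update(str(part).encode("utf-8"))
    # The doc strings carry the grammar rules, token names and rule names.
    for doc in docstrings:
        sig.update(doc.encode("utf-8"))
    return sig.hexdigest()


docs = ["expr : expr PLUS term", "expr : term", "term : NUMBER"]
prec = [("left", "PLUS")]

base = table_signature("3.2", "LALR", "expr", prec, docs)
# Editing a grammar rule's doc string yields a different digest.
changed = table_signature("3.2", "LALR", "expr", prec, docs[:2] + ["term : NAME"])
print(base != changed)  # True
```

Anything that is not fed into the digest, by construction, cannot invalidate the stored tables.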
Q: The first time lr_read_tables is called (line 2705) in the yacc function,
the signature only contains the __tabversion__, method and start symbol. I wonder
if it can ever match the signature stored in the parse table file? (note that due
to the 'or' in line 2615, the signature is not even checked if optimize is
enabled)
A:
As noted above, in optimized mode there is no validation so the fact that the
signature contains anything at all is merely incidental.
Followup:
Just as a note, I am unlikely to change this implementation of optimize mode.
Startup time is often a critical issue in compilers/parsers. I really do want to
eliminate as much extra processing as possible in this mode.
Cheers,
Dave
> Q: The signature is updated with __tabversion__, method, the start symbol,
> precedence rules and the doc-strings of grammar rules. I'm missing the token list
> and the names of the grammar rules. Why is this? (There may be other things missing?)
> A: The signature only includes things that affect the underlying parsing tables.
> Doc strings already have all of the grammar rules, token names, and rule names.
> The start symbol and precedence rules are the only other things that would
> change the underlying table.
OK, I understand that the doc-strings contain the rule names, but I'm still missing the
actual names of the functions that the doc-strings belong to. The parse table file
contains those names (as well as file names). If I only rename a p_* function, the
parse table file becomes invalid as it refers to the old name, not the new name.
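A minimal sketch of the rename problem, assuming the signature is an MD5 digest computed over the doc strings only. The helper docstring_digest and both p_* functions are hypothetical and exist just for this illustration.

```python
import hashlib


def p_expr_old(p):
    "expr : expr PLUS term"


def p_expr_renamed(p):
    "expr : expr PLUS term"


def docstring_digest(funcs):
    """Hypothetical helper: hash the doc strings, ignoring function names."""
    sig = hashlib.md5()
    for func in funcs:
        sig.update(func.__doc__.encode("utf-8"))
    return sig.hexdigest()


# Same doc string, different function name: the digest cannot tell them apart.
same = docstring_digest([p_expr_old]) == docstring_digest([p_expr_renamed])
print(same)  # True
```

Under that assumption the stored signature still matches after the rename, even though the table file refers to the old function name.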
Dennis