Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

YACC for dynamic grammars

10 views
Skip to first unread message

jo...@ima.uucp

unread,
Feb 5, 1987, 12:58:58 AM2/5/87
to
For a simple dynamic grammar in YACC:

Define operator tokens as follows:

%left OL1
%right OR1
%left OL2
%right OR2
%left OL3
%right OR3
etc.

The lexer for 'C' already has to lookup names in the symbol
table to see if they are typedef_names or identifiers.

You will now lookup operator names also to see what class (OL1, etc.) they
are currently. The symbol table is loaded initially with
standard operators and precedences. The currently defined class
is returned as the lexical token for yacc, and the actual
operator name is returned as the value (for semantic analysis).

For even more generality, the lexer should recognize three character
classes: alpha, whitespace and operator. Name tokens are alpha characters
seperated by whitespace or operators. Operator strings are sequences
of operators seperated by alpha. Operator strings are divided into
tokens by finding the largest left substring which is currently in
the symbol table (repeat as required). All tokens are looked up in
the symbol table to decide if they are

a) reserved names (e.g. register) or operators (e.g. '{');
return appropriate token.
b) type names; return TYPENAME token.
c) identifiers; return IDENTIFIER token.
d) operators (both names and operators!); return class as token.

The symbol table is preloaded (from a control file, perhaps) with the
standard 'C' reserved words, types, and operator hierarchy.

The user can #include custom grammars for various applications:

complex arithmetic
complex polynomial arithmetic
string processing
etc.

Even reserved words can be redefined!

The compiler need only hardwire a minimal subset of reserved words.
Operators can be bound to functions or 'C-macros' based on operator
name and operand types. Standard types are symbolic and used only for
binding purposes. C-macros are machine dependent functions
expanded inline by the code-generator, all other operators are
function calls (or inline expansions). A function call can be
replaced by a C-macro in a specific implementation for efficiency purposes.
The first cut of a code generator can do everything except function
calls with function calls! The various operators can then be added
as C-macros as time permits. (Notice that argument passing for function
calls needs to recognize types.)

A similar approach will allow the user to define his own control
structures.

The key performance issue here is the symbol table - it has got to
be fast, because it gets consulted by the lexer for every token!

Unless the programmer excerises some restraint, a language of this
nature will result in cryptic code!

Although having the lexer check the symbol table for every token is more
expensive than a fixed table driven lexer, notice that the preprocessor
has to look up every token anyway. By combining the 'C' preprocessor
with lexical analysis, you can provide a dynamic grammar and speed up
compilation at the same time!

Every token would be looked up by the lexer to determine if it is

a) a preprocessor symbol
b) a typedef name
c) a reserved word
d) an identifier
e) an operator

and apropriate tokens passed to YACC.
--
Stuart D. Gathman <..!seismo!dgis!bms-at!stuart>
[It's certainly true that you can set up a yacc-based parser quite easily
that lets you add new tokens that behave syntactically just like other
existing tokens using this scheme. I believe that the intent of the
original question, though, was to add whole new productions to the grammar on
the fly as is possible with a parser using Earley's algorithm. The
consensus seems to be that since the yacc grammar tables are constructed
only after an extensive global analysis of the grammar, incremental changes
are unlikely to be possible without remaking the tables from scratch. -John]
--
Send compilers articles to ima!compilers or, in a pinch, to Lev...@YALE.EDU
Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | cca}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers. Meta-mail to ima!compilers-request

0 new messages