How do I define the AST for what's essentially an enumeration?


Steve Kelem

Oct 8, 2012, 8:21:48 PM
to sab...@googlegroups.com
I've searched for documentation on SableCC 3 ASTs, but haven't found a good description of the AST section.
I've pieced together a bunch of things, but I am now stuck.
I have a grammar that contains the following production(s):

    var_type = {event} event_token
    | {integer} integer_token
    | {parameter} parameter_token
    | {real} real_token
    | {reg} reg_token
    | {supply0} supply0_token
    | {supply1} supply1_token
    | {time} time_token
    | {tri} tri_token
    | {triand} triand_token
    | {trior} trior_token
    | {trireg} trireg_token
    | {tri0} tri0_token
    | {tri1} tri1_token
    | {wand} wand_token
    | {wire} wire_token
    | {wor} wor_token;

Each of the alternatives is a token formed from the (obvious) characters (e.g., "time_token" is the token produced when the lexer finds the word "time").
I want the var_type to be represented by a Java Enumeration with a value for each of the possible tokens.
  1. What goes in the AST section?
    var_def = ?;
  2. What goes in the code section for each alternative? I would think it would be something like:
    | {tri1} tri1_token { -> New var_type(TRI1) }
Thanks for your help.

Steve

Etienne Gagnon

Oct 9, 2012, 7:11:04 AM
to sab...@googlegroups.com
Hi Steve,

For (1), I would define:
var_def = [...] | {enum} identifier | ...
For (2), there are no actions in SableCC grammars. To implement actions, you extend the DepthFirstAdapter class and override the appropriate methods:
...
  public void outATri1VarType(ATri1VarType node) {
    // do your actions here
  }
This is explained in chapters 3 and 6 of my M.Sc. thesis.
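
As a rough illustration of this approach (not code taken from the thesis), the sketch below records a Java enum value from the out... methods. The Node, ATri1VarType, AWireVarType, and DepthFirstAdapter classes shown here are simplified stand-ins for the classes SableCC would generate, stubbed only so the example compiles by itself; the VarType enum and the VarTypeCollector class are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// The enumeration asked about in the question (abbreviated to three values).
enum VarType { TRI0, TRI1, WIRE }

// --- Hypothetical stand-ins for SableCC-generated classes. ---
// In a real project these come from the generated packages; the traversal
// here is deliberately simplified to a direct call of the out... method.
abstract class Node {
    abstract void apply(DepthFirstAdapter visitor);
}

class ATri1VarType extends Node {
    @Override
    void apply(DepthFirstAdapter visitor) { visitor.outATri1VarType(this); }
}

class AWireVarType extends Node {
    @Override
    void apply(DepthFirstAdapter visitor) { visitor.outAWireVarType(this); }
}

class DepthFirstAdapter {
    public void outATri1VarType(ATri1VarType node) { /* default: do nothing */ }
    public void outAWireVarType(AWireVarType node) { /* default: do nothing */ }
}

// --- The part you would write yourself. ---
// Overrides one out... method per alternative and records the matching enum value.
class VarTypeCollector extends DepthFirstAdapter {
    final List<VarType> seen = new ArrayList<>();

    @Override
    public void outATri1VarType(ATri1VarType node) { seen.add(VarType.TRI1); }

    @Override
    public void outAWireVarType(AWireVarType node) { seen.add(VarType.WIRE); }
}

class VarTypeCollectorDemo {
    public static void main(String[] args) {
        VarTypeCollector collector = new VarTypeCollector();
        new ATri1VarType().apply(collector);
        new AWireVarType().apply(collector);
        System.out.println(collector.seen); // prints [TRI1, WIRE]
    }
}
```

With one alternative per keyword, this needs one overridden method per alternative, which is part of what makes the single-identifier design discussed later in the thread attractive.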

Have fun!

Etienne
Etienne Gagnon, Ph.D.
http://sablecc.org
On 2012-10-08 20:21, Steve Kelem wrote:
[...]Each of the alternatives is a token formed from the (obvious) characters (e.g., a "time_token" is the token for when the word "time" has been found by the lexer.)

Steve Kelem

Oct 9, 2012, 11:52:57 AM
to sab...@googlegroups.com
I'm not sure if you're saying that each of the alternatives for the productions for var_type should be

    var_type = {enum} event_token
    | {enum} integer_token
    | {enum} parameter_token
    | {enum} real_token
    | {enum} reg_token
    | {enum} supply0_token
    | {enum} supply1_token
    | {enum} time_token
    | {enum} tri_token
    | {enum} triand_token
    | {enum} trior_token
    | {trireg} trireg_token
    | {enum} tri0_token
    | {enum} tri1_token
    | {enum} wand_token
    | {enum} wire_token
    | {enum} wor_token;

or not define all these tokens, change the production to:
var_type = {enum} identifier;
and have the Visitor check that the identifier is one of the valid tokens and issue an error otherwise.
Having separate alternatives means that the parser would be the code that detects a misplaced identifier.
Pushing the check off to the AST visitor means that the visitor would be responsible for the error checking, and could possibly continue and find other errors.

Do you have a feel for which approach is "better"?

Thanks,
Steve

Steve Kelem

Oct 9, 2012, 11:54:24 AM
to sab...@googlegroups.com
I'm not sure if you're saying that each of the alternatives for the productions for var_type should be (a):


    var_type = {enum} event_token
    | {enum} integer_token
    | {enum} parameter_token
    | {enum} real_token
    | {enum} reg_token
    | {enum} supply0_token
    | {enum} supply1_token
    | {enum} time_token
    | {enum} tri_token
    | {enum} triand_token
    | {enum} trior_token
    | {trireg} trireg_token
    | {enum} tri0_token
    | {enum} tri1_token
    | {enum} wand_token
    | {enum} wire_token
    | {enum} wor_token;
(which may not be valid SableCC, I'm not sure.)
or (b) not define all these tokens, change the production to:

Etienne Gagnon

Oct 9, 2012, 5:38:23 PM
to sab...@googlegroups.com
Hi Steve,

Ask yourself: if there were a syntax error in an input file, would I rather get:
  Syntax error at x,y: expecting event_token, integer_token, (a long list of tokens)...
or
  Semantic error at x,y: "foo" is not a valid enumeration
?

I think you'll reach the same conclusion I did: use the semantic approach.
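
A minimal sketch of what the semantic approach could look like on the Java side, assuming the grammar uses var_type = {enum} identifier; and a visitor passes the identifier's text and position to a checker. The names VarTypeName, fromKeyword, and SemanticChecker are hypothetical, and the line and position values are plain ints here (in generated SableCC code they would presumably come from the identifier token).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// One enum constant per keyword listed in the original var_type production.
enum VarTypeName {
    EVENT, INTEGER, PARAMETER, REAL, REG, SUPPLY0, SUPPLY1, TIME,
    TRI, TRIAND, TRIOR, TRIREG, TRI0, TRI1, WAND, WIRE, WOR;

    // Maps identifier text such as "tri1" to TRI1; the match is case-sensitive
    // against the lower-case constant name. Returns empty for anything else.
    static Optional<VarTypeName> fromKeyword(String text) {
        for (VarTypeName t : values()) {
            if (t.name().toLowerCase(Locale.ROOT).equals(text)) {
                return Optional.of(t);
            }
        }
        return Optional.empty();
    }
}

// Hypothetical checker that a visitor could call for each {enum} identifier.
// It records an error and keeps going, so later errors can still be found.
class SemanticChecker {
    final List<String> errors = new ArrayList<>();

    Optional<VarTypeName> check(String text, int line, int pos) {
        Optional<VarTypeName> result = VarTypeName.fromKeyword(text);
        if (!result.isPresent()) {
            errors.add("Semantic error at " + line + "," + pos
                    + ": \"" + text + "\" is not a valid enumeration");
        }
        return result;
    }
}

class SemanticCheckerDemo {
    public static void main(String[] args) {
        SemanticChecker checker = new SemanticChecker();
        System.out.println(checker.check("tri1", 1, 1)); // prints Optional[TRI1]
        checker.check("foo", 3, 7);
        System.out.println(checker.errors.get(0));
        // prints Semantic error at 3,7: "foo" is not a valid enumeration
    }
}
```

Because the checker collects messages instead of throwing, a single pass can report every invalid identifier, which matches the point raised earlier about the visitor being able to continue after an error.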


Have fun!

Etienne
Etienne Gagnon, Ph.D.
http://sablecc.org
On 2012-10-09 11:54, Steve Kelem wrote:
I'm not sure if you're saying that each of the alternatives for the productions for var_type should be (a):
    var_type = {enum} event_token
    | [...]