Understanding a grammar before update: What does ## mean in the rule actions?


Duncan Gibson

Nov 30, 2015, 11:02:29 AM
to antlr-discussion
I have inherited an Antlr-2.7.7 application, last modified in 2007(!) and need to be able to understand what it's doing before I can think of upgrading.

My question is: what does the "##" represent in the rule below?

<tt>
    remark_set :
        REMARK_SET
        { ##.setRemarkList (Express_Lexer.getRemarkList());
        }
        ;

    simple_id :
        t1:BASED_ON^ (remark_set)?
            { #t1.setType(SIMPLE_ID);
                Messages.warning("Using Edition 2 keyword as identifier", #t1);
            }
        ...
        | SIMPLE_ID^ (remark_set)?
        ;
</tt>

I've found "#id" in http://www.antlr2.org/doc/trees.html#_bb1 and I've found mention of "##" in the Migrating from V2 to V3 guide, but there is still no explanation of what "##" actually means or does :-(

So far I have not been able to find this in the currently available tutorials and documentation, but then searching the web for "##" is tricky. Any links to older relevant documentation would be much appreciated.

I think I'm stuck for upgrading because I need the C++ runtime, but that's a question for another day.

Duncan

Mike Lischke

Nov 30, 2015, 12:02:24 PM
to antlr-di...@googlegroups.com
It's been too long since I worked with that syntax but let me try ...

> My question is: what does the "##" represent in the rule below?
>
> <tt>
> remark_set :
> REMARK_SET
> { ##.setRemarkList (Express_Lexer.getRemarkList());
> }
> ;

I'm not sure about this. Could be a reference to the rule itself.

>
> simple_id :
> t1:BASED_ON^ (remark_set)?
> { #t1.setType(SIMPLE_ID);

This is obviously a reference to token t1 and sets a type. Use "{ $type = SIMPLE_ID; }" now.

> Messages.warning("Using Edition 2 keyword as identifier", #t1);
> }
> ...
> | SIMPLE_ID^ (remark_set)?
> ;
> </tt>
>

There is another form, as in this rule:

common_resource_attributes:
(load_attribute | memory_attribute)+
{#common_resource_attributes = #(#[RESOURCE_ATTRIBUTES, "resource attributes"], common_resource_attributes);}
;

which would be in ANTLR 3 syntax:

common_resource_attributes:
(load_attribute | memory_attribute)+
-> ^(RESOURCE_ATTRIBUTES (load_attribute | memory_attribute)+)
;
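
To spell out what the ANTLR 2 action above does, as far as I understand it: #[TYPE, "text"] creates a fresh node that has no input token behind it, #(root, children...) hangs the children under that root, and the assignment to #common_resource_attributes makes the result the tree returned by the rule. Here is a rough stand-alone Java sketch of that idea. The Node class, the make helper, and the "LOAD"/"MEMORY" stand-in nodes are made up for illustration; this is not the ANTLR runtime API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MakeRootDemo {
    // Hypothetical minimal node type, only for this sketch.
    static final class Node {
        final String type;
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String type, String text) {
            this.type = type;
            this.text = text;
        }

        // LISP-style dump: (root child child ...)
        String toStringTree() {
            if (children.isEmpty()) {
                return text;
            }
            StringBuilder sb = new StringBuilder("(").append(text);
            for (Node c : children) {
                sb.append(' ').append(c.toStringTree());
            }
            return sb.append(')').toString();
        }
    }

    // Models #(root, children...): attach the nodes under root, return root.
    static Node make(Node root, List<Node> children) {
        root.children.addAll(children);
        return root;
    }

    public static void main(String[] args) {
        // Stand-ins for whatever (load_attribute | memory_attribute)+ matched.
        List<Node> matched = Arrays.asList(
                new Node("LOAD", "load"),
                new Node("MEMORY", "memory"));
        // Models #[RESOURCE_ATTRIBUTES, "resource attributes"].
        Node root = new Node("RESOURCE_ATTRIBUTES", "resource attributes");
        System.out.println(make(root, matched).toStringTree());
    }
}
```

Running it prints (resource attributes load memory), which is the shape the ^(RESOURCE_ATTRIBUTES ...) rewrite is meant to produce.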

Mike
--
www.soft-gems.net

Mike Lischke

Nov 30, 2015, 12:09:21 PM
to antlr-di...@googlegroups.com

> On 30.11.2015 at 18:02, Mike Lischke <mike.l...@googlemail.com> wrote:
>
>> simple_id :
>> t1:BASED_ON^ (remark_set)?
>> { #t1.setType(SIMPLE_ID);
>
> This is obviously a reference to token t1 and sets a type. Use "{ $type = SIMPLE_ID; }" now.

Ah, I gave you the syntax for a lexer rule, which doesn't work here. In fact, I'm not sure you can even change the type of a token in a parser rule that way. Maybe try $t1.setType(SIMPLE_ID), but I seriously doubt that is the original intention. It looks more like an attempt to rewrite the tree for the simple_id rule, so maybe do that instead: -> ^(SIMPLE_ID BASED_ON remark_set?), or leave out BASED_ON if the intent is to change the tree that way.

Mike
--
www.soft-gems.net

Maxim Degtyarev

Nov 30, 2015, 1:07:06 PM
to antlr-di...@googlegroups.com
According to
http://www.antlr2.org/javadoc/antlr/actions/java/ActionLexer.html
the token "##" is shorthand for "currentRule_AST".
It is probably the same as "$$" in yacc/bison.
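
To make the mapping concrete, here is a toy Java sketch of that substitution. It is not ANTLR's real ActionLexer; the class and method names are made up, and it assumes that "#id" simply becomes "id_AST", which is what the generated parsers usually show for a labelled element.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HashRefDemo {
    // Assumed shape of a "#id" reference inside an action.
    private static final Pattern HASH_ID =
            Pattern.compile("#([A-Za-z_][A-Za-z0-9_]*)");

    // Toy model of the rewrite applied to the text of a rule action.
    static String translate(String action, String currentRule) {
        // Handle "##" first so it is not read as "#" followed by an id.
        String out = action.replace("##", currentRule + "_AST");
        // "#id" becomes "id_AST" (assumption, see the note above).
        Matcher m = HASH_ID.matcher(out);
        return m.replaceAll("$1_AST");
    }

    public static void main(String[] args) {
        System.out.println(translate(
                "##.setRemarkList(Express_Lexer.getRemarkList());", "remark_set"));
        System.out.println(translate(
                "#t1.setType(SIMPLE_ID);", "simple_id"));
    }
}
```

For the two actions from the original post this prints remark_set_AST.setRemarkList(Express_Lexer.getRemarkList()); and t1_AST.setType(SIMPLE_ID); so "##" inside remark_set names the tree that the rule itself is building.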


Duncan Gibson

Dec 1, 2015, 5:20:13 AM
to antlr-discussion
Maxim Degtyarev wrote:
> According to http://www.antlr2.org/javadoc/antlr/actions/java/ActionLexer.html
> the token "##" is shorthand for "currentRule_AST".
> It is probably the same as "$$" in yacc/bison.

Wow! One month on stackoverflow and only 20-odd views, probably mostly
myself checking for answers, but one evening here and I have answers :-)

Now that I have seen the answer it is blindingly obvious. I've checked back in
the code and found that most rules don't have alternatives at their top level, and
these all use ## to refer to the rule's own tree node, and then set info such as the
line/column by using the #id of a particular token, or ##.getFirstChild(). There are
some rules, like the "simple_id" example above, that do have alternatives at the top
level, and these use #id to discriminate each branch.

OK. Now it starts to make a bit more sense...

Cheers
Duncan
