ANTLR4 sending a null context attribute to listener


Daniel Ricardo Castro Alvarado

Apr 11, 2015, 9:12:05 PM
to antlr-di...@googlegroups.com
Hi everybody.

I asked this on Stack Overflow, but I thought I could get more help here.

I'm creating a simple language compiler, and I'm facing some unexpected behavior. I simplified the grammar as follows:

    grammar Language;

    program         : (varDecl)* (funcDecl)* EOF;
    varDecl         : type IDENTIFIER ('=' expression)? ';';
    funcDecl        : type IDENTIFIER '(' ')' statementBlock;
    type            : 'int' # IntType
                    ;

    statementBlock  : '{' (statement)* '}';
    statement       : varDecl ;

    expression      : IDENTIFIER '(' (expression (',' expression)*)? ')' # FuncCallExpression
                    ;

    IDENTIFIER      : ('a'..'z')+;
    WHITE_SPACE     : [ \t\u000C\n\r]+ -> skip;

As statementBlock is a mandatory element of the funcDecl rule, I would expect that, inside a listener, the FuncDeclContext always contains a non-null statementBlock. The problem is that I'm getting a null statementBlock for the following input:

    int b() {
    }
    i nt a() {
        int x = b();
    }

As far as I understand, when facing invalid input, ANTLR inserts special nodes that represent the expected match (like the <missing ID> example from page 163 of the book), but somehow that's not what's happening here (is it a bug?). When I use the following listener, I get "Oh, no :(":

    public class DummyListener extends LanguageBaseListener {
        @Override
        public void exitFuncDecl(LanguageParser.FuncDeclContext ctx) {
            super.exitFuncDecl(ctx);

            if (ctx.statementBlock() == null) {
                System.out.println("Oh, no :(");
            }
        }
    }


Changing funcDecl to include an action (in an attempt to debug this behavior):

    funcDecl        : type IDENTIFIER '(' ')' statementBlock { System.out.println("ID: " + $IDENTIFIER.text + ", text is: " + $statementBlock.text); };

and modifying exitFuncDecl from the listener to print the identifier:

    System.out.println("Listener: id " + ctx.IDENTIFIER().getText());
    if (ctx.statementBlock() == null) {
        System.out.println("Oh, no :(");
    } else {
        System.out.println("content is " + ctx.statementBlock().getText());
    }

shows this output:

    line 3:0 extraneous input 'i' expecting {<EOF>, 'int'}
    ID: b, text is: {}
    line 4:7 mismatched input '=' expecting '('
    Listener: id b
    content is {}
    Listener: id x
    Oh, no :(

It appears that ANTLR is calling exitFuncDecl but not the rule action. I think the rule action behavior is the right one here, as "x" is causing the null statementBlock.
What is the reason for this behavior?

Thank you very much

Eric Vergnaud

Apr 12, 2015, 3:09:51 AM
to antlr-di...@googlegroups.com
Hi,

ANTLR only instantiates a context when it encounters matching input.
You are correct in expecting that when a listener callback is called, the corresponding context is non-null.
You are, however, incorrect in assuming that every member of that context instance is non-null.
This is key to simplified parsing: the parser never fails, even if it doesn't match anything, and it is much easier to process failures after parsing than during parsing.

You are also correct in expecting that ANTLR will tentatively insert missing tokens or remove extra ones.
However, this only applies to tokens, not to characters; in other words, to parser rules, not lexer rules.
Your input "i" matches the IDENTIFIER token rule, and ANTLR cannot repair tokens, only parser rules.

Eric
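
To make the lexer point above concrete, here is a minimal hand-written sketch of how the thread's lexer rules would split the input. It is not ANTLR's lexer: the class LexerSketch and its tokenize method are hypothetical names for illustration, and it only approximates the rules 'int', IDENTIFIER and WHITE_SPACE from the grammar in the question.

```java
import java.util.ArrayList;
import java.util.List;

public class LexerSketch {
    /** Approximates the thread's lexer rules; returns a printable label per token. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;                                   // WHITE_SPACE -> skip
            } else if (c >= 'a' && c <= 'z') {
                int start = i;                         // take the longest run of a..z
                while (i < input.length()
                        && input.charAt(i) >= 'a' && input.charAt(i) <= 'z') {
                    i++;
                }
                String text = input.substring(start, i);
                // 'int' is a literal token in the grammar; any other run is an IDENTIFIER
                tokens.add(text.equals("int") ? "'int'" : "IDENTIFIER(" + text + ")");
            } else {
                // Simplification: every other character becomes a one-character literal.
                // A real lexer would reject characters the grammar does not mention.
                tokens.add("'" + c + "'");
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("int b()"));
        System.out.println(tokenize("i nt a()"));
    }
}
```

With the space in the middle, "i nt a()" yields three IDENTIFIER tokens (i, nt, a) and no 'int' token at all. The parser only ever sees those tokens, so there is nothing it could insert or delete to turn them back into the keyword.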

Daniel Ricardo Castro Alvarado

Apr 12, 2015, 3:42:03 AM
to antlr-di...@googlegroups.com

Thanks for your answer Eric,

But then why isn't ANTLR calling the rule action? It's only calling the listener method.
Also, if the input turned out to be invalid, why does it make sense to create a node for a rule it didn't match?

I was wondering what the right way to handle this kind of case would be. Checking for null attributes in every method of the listener doesn't look quite right to me.

Thank you very much

Eric Vergnaud

Apr 12, 2015, 11:42:30 AM
to antlr-di...@googlegroups.com
Your program rule makes it clear that a funcDecl can only be followed by another funcDecl.
After successfully parsing int b() {}, ANTLR looks for the next funcDecl, which must begin with int followed by an IDENTIFIER.
It skips all invalid tokens until it matches int x on line 4, then it complains about the unexpected = (as shown by the logs).
So it never again reaches a point where the rule action would be called.
The way to manage errors is to install an error listener.
If there are no errors, you can safely walk through the tree.
But if you want to walk through the tree even with parsing errors, then you need to expect the unexpected.

Eric
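
The explanation above can be illustrated with a simplified, hand-written model of a rule method. This is a sketch under stated assumptions, not ANTLR's generated code: RuleMethodSketch, its match helper, the string tokens and the reduced statementBlock (only "{}") are all hypothetical. It mirrors the general shape described in the thread: the context is created on entry, children are filled in as they match, the embedded action is the last step, and a mismatch leaves a partly filled context behind.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RuleMethodSketch {
    /** Hypothetical stand-in for FuncDeclContext: children stay null until matched. */
    static class FuncDeclContext {
        String identifier;
        String statementBlock;
        RuntimeException exception; // recorded when the rule is abandoned part-way
    }

    final List<String> log = new ArrayList<>();
    private final List<String> tokens;
    private int pos = 0;

    RuleMethodSketch(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    /** Consumes and returns the next token if it is the expected one, otherwise throws. */
    private String match(String expected) {
        String actual = pos < tokens.size() ? tokens.get(pos) : "<EOF>";
        boolean ok = expected.equals("IDENTIFIER")
                ? actual.matches("[a-z]+")
                : actual.equals(expected);
        if (!ok) {
            throw new IllegalStateException(
                "mismatched input '" + actual + "' expecting '" + expected + "'");
        }
        pos++;
        return actual;
    }

    /** Models: funcDecl : type IDENTIFIER '(' ')' statementBlock { action } ; */
    FuncDeclContext funcDecl() {
        FuncDeclContext ctx = new FuncDeclContext(); // created on entry, before any match
        try {
            match("int");
            ctx.identifier = match("IDENTIFIER");
            match("(");
            match(")");
            match("{");                              // statementBlock, reduced to "{}" here
            match("}");
            ctx.statementBlock = "{}";
            log.add("action: ID " + ctx.identifier); // embedded action, skipped on mismatch
        } catch (IllegalStateException e) {
            ctx.exception = e;                       // context survives, only partly filled
            log.add("error: " + e.getMessage());
        }
        return ctx;                                  // returned in both cases
    }

    /** What the thread's listener would report for a given context. */
    static String exitFuncDecl(FuncDeclContext ctx) {
        return "Listener: id " + ctx.identifier + ", "
            + (ctx.statementBlock == null ? "Oh, no :(" : "content is " + ctx.statementBlock);
    }

    public static void main(String[] args) {
        RuleMethodSketch good = new RuleMethodSketch("int", "b", "(", ")", "{", "}");
        RuleMethodSketch broken = new RuleMethodSketch("int", "x", "=", "b", "(", ")", ";");
        FuncDeclContext ok = good.funcDecl();
        FuncDeclContext bad = broken.funcDecl();
        good.log.forEach(System.out::println);
        broken.log.forEach(System.out::println);
        System.out.println(exitFuncDecl(ok));
        System.out.println(exitFuncDecl(bad));
    }
}
```

In this model the action line appears only for b, while both contexts are handed to the listener; the one for x carries an identifier, a null statementBlock and a recorded exception. A listener that has to run on trees with errors could use that recorded exception as a guard. ANTLR's ParserRuleContext exposes a public exception field, but whether it is set in every recovery path is worth checking against the runtime documentation.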

Daniel Ricardo Castro Alvarado

Apr 12, 2015, 12:12:41 PM
to antlr-di...@googlegroups.com
I still find it a bit odd that it generates an incomplete FuncDeclContext. Let's suppose I change the program rule as follows:

    program : (varDecl|funcDecl)* EOF;

With the following input, it produces the same result:

    int b() {
    }
    i nt a() {

        int x = ;
    }


Now, if there are no parentheses after x and no braces (not even ones added during error correction), how does it know that it should be a FuncDeclContext and not a VarDeclContext?

Anyway, would it be safe to assume that a null exception field inside the context implies I have a complete rule?

Thank you again