How do I keep one element unique within a repeated group of alternatives?

Amaury Billet

Mar 13, 2015, 9:53:47 AM
to antlr-di...@googlegroups.com
Hello everyone,

I have run into a problem.
I'm currently trying to parse an EDIF file (an electrical schematic description). I wrote the ANTLR grammar from the BNF of the EDIF standard.
Many rules in the BNF are defined like the following example:

behaviorView ::= '(' 'behaviorView'
    viewNameDef
    {  comment  |  <nameInformation>  |  userData  }
    ')'

Angled brackets "< >" signify a construct that may occur at most once within the immediately enclosing syntactic grouping construct. For example, { A | <B> } represents all sequences of zero or more A's that contain at most one B; that is, A, A A, B, A A B A, B A A A A, and the empty sequence are all valid sequences.
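
To make the rule concrete, here is a minimal Python sketch (not part of EDIF or ANTLR) that checks a flat sequence of item names against the "at most once" constraint. The function name `satisfies_at_most_once` is made up for this illustration.

```python
from collections import Counter

def satisfies_at_most_once(sequence, unique_items):
    """Return True if no item listed in unique_items occurs more than once."""
    counts = Counter(sequence)
    return all(counts[item] <= 1 for item in unique_items)

# The sequences quoted above as valid for { A | <B> }.
valid_examples = [
    ["A"],
    ["A", "A"],
    ["B"],
    ["A", "A", "B", "A"],
    ["B", "A", "A", "A", "A"],
    [],
]
results = [satisfies_at_most_once(seq, {"B"}) for seq in valid_examples]

# A sequence with two B's breaks the constraint.
rejected = satisfies_at_most_once(["A", "B", "B"], {"B"})
```

Every entry in `results` is `True`, while `rejected` is `False` because B appears twice.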

I wrote the following rule in the .g4 file:


behaviorView : '(behaviorView'
    viewNameDef
    (  comment  |  nameInformation | userData  )*
    ')';


However, written this way, the rule no longer enforces that "nameInformation" is unique. One solution that looked logical was to write:

behaviorView : '(behaviorView'
    viewNameDef
    (  comment  | (nameInformation)? | userData  )*
    ')';


but the ANTLR tool doesn't accept it (error: the rule "behaviorView" can match an empty string).

Has anyone run into this kind of problem?
Is there a way to keep the uniqueness constraint?

Thanks! ;)

Eric Vergnaud

Mar 13, 2015, 10:52:50 AM
to antlr-di...@googlegroups.com
Hi,

Am I correct in thinking that you could solve this as follows?

behaviorView : '(behaviorView'
    viewNameDef
    commentOrNameorData
    ')';


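commentOrNameorData :
    // any number of comment/userData items around exactly one nameInformation
    commentOrData* nameInformation commentOrData*
    // any number of comment/userData items (possibly none), no nameInformation
  | commentOrData*
  ;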

commentOrData:
 comment
 | userData
 ;

Amaury Billet

Mar 13, 2015, 11:10:38 AM
to antlr-di...@googlegroups.com
Well, thanks for the help; it works perfectly for this example. However, I should have specified that some rules are much more complicated than this one. For example:

schematicInterconnectAttributeDisplay ::= '(' 'schematicInterconnectAttributeDisplay'
    {  <connectivityTagGeneratorDisplay>  |  <criticalityDisplay>  |  interconnectAttachedText  |  interconnectDelayDisplay  |  <interconnectNameDisplay>  |  interconnectPropertyDisplay  }
    ')'

And it's not an isolated case.
Moreover, with this method, the parsing time becomes much longer. Maybe there is a compromise to be found.
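
To give a rough idea of how this scales: if the "at most once" constraint is spelled out in the grammar by listing every order in which the unique items may appear (with the repeatable items allowed in between), the number of alternatives is the number of ordered arrangements of any subset of the unique items. The Python sketch below only counts those alternatives; it assumes that fully flattened encoding, and it illustrates grammar size rather than measuring parse time. The function name is made up for the example.

```python
from math import perm

def flattened_alternative_count(unique_count):
    """Count ordered arrangements of every subset of `unique_count` distinct items."""
    return sum(perm(unique_count, chosen) for chosen in range(unique_count + 1))

# One unique item (behaviorView) up to five unique items.
counts = {k: flattened_alternative_count(k) for k in range(1, 6)}
```

With one unique item there are 2 alternatives; with the three unique items of schematicInterconnectAttributeDisplay there are already 16 (1 + 3 + 6 + 6), and five unique items would need 326.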

Eric Vergnaud

Mar 13, 2015, 12:01:31 PM
to antlr-di...@googlegroups.com
It seems to me that those requirements go beyond a parser's typical mission and fall under a post-parsing check.
(Think of a Java compiler: it does not check the validity of identifiers at parse time, but in a later compilation phase.)
I would drop those requirements from the grammar and simply check those rules using a visitor after parsing.
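
A post-parse check of this kind could look roughly like the Python sketch below. It does not use the ANTLR runtime: `Node` is a hypothetical stand-in for a parse-tree node, and `AT_MOST_ONCE` and `find_violations` are names invented for the illustration. With a real ANTLR-generated parser, the same counting would live in a visitor or listener over the generated context classes. The grammar itself stays in the permissive `( a | b | c )*` form.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class Node:
    """Hypothetical stand-in for a parse-tree node: a rule name plus child nodes."""
    rule: str
    children: list = field(default_factory=list)

# Children that may occur at most once, keyed by parent rule
# (the angled-bracket items from the two BNF rules quoted in this thread).
AT_MOST_ONCE = {
    "behaviorView": {"nameInformation"},
    "schematicInterconnectAttributeDisplay": {
        "connectivityTagGeneratorDisplay",
        "criticalityDisplay",
        "interconnectNameDisplay",
    },
}

def find_violations(node):
    """Walk the tree depth-first; return (parent rule, child rule, count) per repeat."""
    violations = []
    counts = Counter(child.rule for child in node.children)
    for name in sorted(AT_MOST_ONCE.get(node.rule, ())):
        if counts[name] > 1:
            violations.append((node.rule, name, counts[name]))
    for child in node.children:
        violations.extend(find_violations(child))
    return violations

# A behaviorView that wrongly contains two nameInformation children.
tree = Node("behaviorView", [
    Node("viewNameDef"),
    Node("comment"),
    Node("nameInformation"),
    Node("userData"),
    Node("nameInformation"),
])
violations = find_violations(tree)
```

Here `violations` holds one entry, `("behaviorView", "nameInformation", 2)`. Adding another rule with unique children only means adding one entry to the table, so the grammar does not grow with the number of constraints.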

Amaury Billet

Mar 13, 2015, 12:21:32 PM
to antlr-di...@googlegroups.com
This is what I was thinking. I tried lots of solutions, but none of them fits.
Thanks again for the help. ;)