How to keep the uniqueness of one element in a stream of many elements?

Amaury Billet

Mar 13, 2015, 9:53:47 AM

to antlr-di...@googlegroups.com

Hello everyone,

I came here because I have run into a problem:
I'm currently trying to parse an EDIF file (electrical schematic description). I wrote the ANTLR grammar from the BNF of the EDIF standard.
Many rules in the BNF are defined like the following example:

behaviorView ::= '(' 'behaviorView'
    viewNameDef
    { comment | <nameInformation> | userData }
    ')'

Angled brackets "< >" signify a construct that may occur at most once within the immediately enclosing syntactic grouping construct. For example, { A | <B> } represents all sequences of zero or more A's that contain at most one B; that is, A, A A, B, A A B A, B A A A A, and the empty sequence are all valid sequences.
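To make the constraint concrete, here is a minimal Python sketch of the "at most once" rule applied to the sequences listed above. It is only an illustration, not part of the EDIF standard or of ANTLR; the function name `at_most_once` is made up for this example.

```python
def at_most_once(sequence, unique_items):
    """Return True if no element of unique_items occurs more than once."""
    return all(sequence.count(item) <= 1 for item in unique_items)


# The sequences from the paragraph above: "A" may repeat, "B" may not.
valid = [
    [],
    ["A"],
    ["A", "A"],
    ["B"],
    ["A", "A", "B", "A"],
    ["B", "A", "A", "A", "A"],
]
results = [at_most_once(seq, {"B"}) for seq in valid]

# A sequence with two B's violates the constraint.
rejected = at_most_once(["A", "B", "B"], {"B"})
```

Every entry in `results` is `True`, while `rejected` is `False`.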

I wrote the following rule in the .g4 file:

behaviorView : '(behaviorView'
    viewNameDef
    ( comment | nameInformation | userData )*
    ')';

However, written this way, the rule no longer enforces the uniqueness of "nameInformation". One solution that looked logical was to write:

behaviorView : '(behaviorView'
    viewNameDef
    ( comment | (nameInformation)? | userData )*
    ')';

but the ANTLR tool rejects it (error: the rule "behaviorView" can match an empty string).

Have you ever tried to solve this kind of problem?
Is there any way to keep the uniqueness constraint?

Thanks! ;)

Eric Vergnaud

Mar 13, 2015, 10:52:50 AM

to antlr-di...@googlegroups.com

Hi,

Am I correct in thinking that you could solve this as follows?

behaviorView : '(behaviorView'
    viewNameDef
    commentOrNameorData
    ')';

commentOrNameorData
    : commentOrData+ nameInformation commentOrData*
    | commentOrData+
    ;

commentOrData
    : comment
    | userData
    ;
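As a quick sanity check, the rule above can be compared with the BNF by modelling both as regular expressions over a string of children. This is only a sketch under an assumed encoding (one character per child, "c" for comment or userData, "n" for nameInformation); the names `SPEC`, `PROPOSED` and `accepts` are invented for the example.

```python
import re

# One character per child: "c" = comment or userData, "n" = nameInformation.
SPEC = re.compile(r"c*(nc*)?")      # { comment | <nameInformation> | userData }
PROPOSED = re.compile(r"c+nc*|c+")  # commentOrNameorData as written above


def accepts(pattern, children):
    """Return True if the whole child string matches the pattern."""
    return pattern.fullmatch(children) is not None


samples = ["", "c", "ccncc", "n", "ncc", "cncn"]
# Maps each sample to (accepted by the BNF, accepted by the proposed rule).
comparison = {s: (accepts(SPEC, s), accepts(PROPOSED, s)) for s in samples}
```

Under this encoding the two agree on "c", "ccncc" and "cncn", but the proposed rule rejects the empty sequence and any sequence that starts with nameInformation ("n", "ncc"), which the BNF allows. A form such as `commentOrData* (nameInformation commentOrData*)?` would correspond to `SPEC`; I have not run that variant through the ANTLR tool here.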

Amaury Billet

Mar 13, 2015, 11:10:38 AM

to antlr-di...@googlegroups.com

Thanks for the help; it works perfectly for this example. However, I should have specified that some rules are much more complicated than this one. For example:

schematicInterconnectAttributeDisplay ::= '(' 'schematicInterconnectAttributeDisplay'
{ <connectivityTagGeneratorDisplay> | <criticalityDisplay> | interconnectAttachedText | interconnectDelayDisplay | <interconnectNameDisplay> | interconnectPropertyDisplay }
')'

And it's not an isolated case.
Moreover, with this method, the parsing time becomes much longer. Maybe there is a compromise to be found.
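A rough way to see why this gets out of hand: if every possible ordering of the at-most-once elements is written as its own alternative, the number of alternatives grows quickly. The sketch below counts them; it assumes that naive encoding (a smarter factoring of the grammar might need fewer), and `ordered_alternatives` is a name made up for this example. It needs Python 3.8+ for `math.perm`.

```python
from math import perm


def ordered_alternatives(k):
    """Count the orderings to spell out for k at-most-once elements.

    Each alternative fixes which of the k elements appear and in what
    order; the repeatable elements may occur freely in between.
    """
    return sum(perm(k, j) for j in range(k + 1))


counts = {k: ordered_alternatives(k) for k in range(1, 5)}
```

For one such element this gives 2 alternatives, as in the behaviorView rule; for the three bracketed elements of schematicInterconnectAttributeDisplay it already gives 16, and for four it gives 65.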

Eric Vergnaud

Mar 13, 2015, 12:01:31 PM

to antlr-di...@googlegroups.com

It seems to me that those requirements go beyond a parser's typical mission and start to fall under a post-parsing checker.

(If you think of a Java compiler, it does not check the validity of identifiers while parsing, but in a later pass.)

I would drop those requirements from the grammar and simply check those rules with a visitor after parsing.
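A minimal sketch of such a post-parse check, in plain Python so it runs without the ANTLR runtime. The `Node` class is only a stand-in for a real parse-tree node, and the `AT_MOST_ONCE` table and `check` function are hypothetical names; with a generated parser the same counting would go inside a visitor or listener.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """Stand-in for a parse-tree node: a rule name plus child nodes."""
    rule: str
    children: list = field(default_factory=list)


# Hypothetical table: rule name -> child rules allowed at most once,
# taken from the bracketed elements in the two BNF rules quoted above.
AT_MOST_ONCE = {
    "behaviorView": {"nameInformation"},
    "schematicInterconnectAttributeDisplay": {
        "connectivityTagGeneratorDisplay",
        "criticalityDisplay",
        "interconnectNameDisplay",
    },
}


def check(node, errors=None):
    """Walk the tree depth-first and collect at-most-once violations."""
    if errors is None:
        errors = []
    counts = Counter(child.rule for child in node.children)
    for name in sorted(AT_MOST_ONCE.get(node.rule, ())):
        if counts[name] > 1:
            errors.append(f"{node.rule}: {name} occurs {counts[name]} times")
    for child in node.children:
        check(child, errors)
    return errors


tree = Node("behaviorView", [
    Node("viewNameDef"),
    Node("nameInformation"),
    Node("comment"),
    Node("nameInformation"),
])
errors = check(tree)
```

Here `errors` holds a single message reporting that nameInformation occurs twice in behaviorView. The grammar itself can then stay in the simple `( comment | nameInformation | userData )*` form.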

Amaury Billet

Mar 13, 2015, 12:21:32 PM

to antlr-di...@googlegroups.com

This is what I was thinking. I tried lots of solutions, but none of them fits.
Thanks again for the help. ;)
