Beginner's syntax problems

Jochen Wiedmann

Nov 20, 2013, 3:01:24 AM
to antlr-di...@googlegroups.com
Sorry for the beginner's questions:

I am trying to parse a file that looks like the following (only the first few lines, for brevity):

0010/** DIALOG SOURCE 22D
0020/*[ DEFINE DIALOG INFO
0030**D* NATURAL Dialog Description 6.3.13.0 / 2013-08-14 13:38
0040/** EMPTY DIALOG COMMENT
0050/*] END-DIALOG-INFO


The language is, of course, a nightmare. (Just to give an example, note the "/*[" in line 2, basically a comment introducer, followed by the obviously important words "DEFINE" "DIALOG" "INFO". I didn't make that language, I am attempting to parse it.)

After quite some thinking, I came to the conclusion that my only chance to handle this would be the following approach:

   1 Line = 1 Token

This may be questionable for most of you, but it got me started, so I am currently happy with that, and I got a manually written lexer/parser working within one week. Now I am at a point where the first grammar modifications are required. So, this is my second attempt to get things working with ANTLR. (See my grammar below.)
However, I am getting the following error messages and am unable to deal with them (in particular, the line=-1, charPosition=-1 is puzzling me):

[ERROR] Message{errorType=SYNTAX_ERROR, args=[mismatched character '\n' expecting ']'], e=MismatchedTokenException(10!=93), fileName='/home/jwi/workspace/ns3mod-parser-antlr/src/main/antlr4/ns3mod.g4', line=-1, charPosition=-1}
[ERROR] Message{errorType=SYNTAX_ERROR, args=[unterminated rule (missing ';') detected at 'COMMENTLINE1 :' while looking for lexer rule element], e=org.antlr.v4.parse.v4ParserException, fileName='/home/jwi/workspace/ns3mod-parser-antlr/src/main/antlr4/ns3mod.g4', line=14, charPosition=0}
[ERROR] Message{errorType=SYNTAX_ERROR, args=[unterminated rule (missing ';') detected at 'DIALOG_SOURCE :' while looking for lexer rule element], e=org.antlr.v4.parse.v4ParserException, fileName='/home/jwi/workspace/ns3mod-parser-antlr/src/main/antlr4/ns3mod.g4', line=19, charPosition=0}
[ERROR] Message{errorType=SYNTAX_ERROR, args=['"' came as a complete surprise to me], e=null, fileName='/home/jwi/workspace/ns3mod-parser-antlr/src/main/antlr4/ns3mod.g4', line=19, charPosition=23}
[ERROR] Message{errorType=SYNTAX_ERROR, args=['/**" "DIALOG" "SOURCE" "22D" EOL;\nDEFINE_DIALOG_INFO: LINENUM "/*[" "DEFINE" "DIALOG" "INFO";\nEND_DIALOG_INFO: LINENUM "/*]" "END-DIALOG-INFO" EOL;\nLF : '\u000A';\nCR : '\u000D';\nTEXT: ~[\\u000A\u000D];\n' came as a complete surprise to me while looking for lexer rule element], e=NoViableAltException(17@[]), fileName='/home/jwi/workspace/ns3mod-parser-antlr/src/main/antlr4/ns3mod.g4', line=19, charPosition=24}


grammar Ns3mod ;

dialog_source:
  dialog_header ;
 
dialog_header :
  DIALOG_source dialog_info? ;

dialog_info :
  DEFINE_DIALOG_INFO END_DIALOG_INFO ;


WS : [ \t\]+ -> skip ;
COMMENTLINE1: LINENUM '**D*' TEXT* EOL ;
COMMENTLINE2: LINENUM '/*' TEXT* EOL ;
COMMENT: (COMMENTLINE1 | COMMENTLINE2) -> skip ;
LINENUM: ([0-9])+ ;
EOL: CR?LF
DIALOG_SOURCE: LINENUM "/**" "DIALOG" "SOURCE" "22D" EOL;
DEFINE_DIALOG_INFO: LINENUM "/*[" "DEFINE" "DIALOG" "INFO";
END_DIALOG_INFO: LINENUM "/*]" "END-DIALOG-INFO" EOL;
LF : '\u000A';
CR : '\u000D';
TEXT: ~[\\u000A\u000D];

Bence Erős

Nov 20, 2013, 5:22:54 AM
to antlr-di...@googlegroups.com
Hello,

It seems to me there are many syntax errors in your grammar file, though when I tried to generate a parser with antlr-4.1 it gave quite different error messages.
Anyway, a fixed version:

grammar Ns3mod ;

dialog_source:
  dialog_header ;
  
dialog_header :
  DIALOG_source dialog_info? ;

dialog_info :
  DEFINE_DIALOG_INFO END_DIALOG_INFO ;

WS : [ \t]+ -> skip ;
COMMENTLINE1: LINENUM '**D*' TEXT* EOL ;
COMMENTLINE2: LINENUM '/*' TEXT* EOL ;
COMMENT: (COMMENTLINE1 | COMMENTLINE2) -> skip ;
LINENUM: ([0-9])+ ;
EOL: CR? LF ;
DIALOG_source: LINENUM '/**' 'DIALOG' 'SOURCE' '22D' EOL;
DEFINE_DIALOG_INFO: LINENUM '/*[' 'DEFINE' 'DIALOG' 'INFO';
END_DIALOG_INFO: LINENUM '/*]' 'END-DIALOG-INFO' EOL;
LF : '\u000A';
CR : '\u000D';
TEXT: ~[\u000A\u000D];


At least it compiles. Not quite sure if it fits your needs.
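
One thing that may bite with the version above: WS -> skip only drops whitespace between tokens, not inside a lexer rule, so a rule such as LINENUM '/**' 'DIALOG' 'SOURCE' would not match a line that has blanks between the words. Also, when two lexer rules match the same text, the one listed first wins, so the generic comment rules can shadow the specific ones. Below is a rough sketch of the "1 line = 1 token" idea with those two points addressed. It is untested and rests on assumptions: the grammar name Ns3modSketch and the SP fragment are made up for illustration, it assumes blanks or tabs between words, no trailing blanks, and a newline at the end of every line, and it covers only the five sample lines from the first post.

```antlr
grammar Ns3modSketch ;   // hypothetical name, not from the original post

dialog_source : dialog_header ;
dialog_header : DIALOG_SOURCE dialog_info? ;
dialog_info   : DEFINE_DIALOG_INFO END_DIALOG_INFO ;

// Specific line tokens come first: when two rules match the same
// text, the rule that is listed first wins.
DIALOG_SOURCE      : LINENUM '/**' SP 'DIALOG' SP 'SOURCE' SP '22D' EOL ;
// EOL is added here (assumption) so that the line end belongs to the token.
DEFINE_DIALOG_INFO : LINENUM '/*[' SP 'DEFINE' SP 'DIALOG' SP 'INFO' EOL ;
END_DIALOG_INFO    : LINENUM '/*]' SP 'END-DIALOG-INFO' EOL ;

// Every other comment line is dropped as a whole.
COMMENT : LINENUM ('**D*' | '/*') TEXT* EOL -> skip ;

// Fragments are building blocks only and never become tokens themselves.
fragment LINENUM : [0-9]+ ;
fragment SP      : [ \t]+ ;      // hypothetical helper: blanks inside a line token
fragment TEXT    : ~[\r\n] ;     // single backslashes; '\\u000A' would exclude '\', 'u', '0', 'A' instead
fragment EOL     : '\r'? '\n' ;
```

With this layout the line "0040/** EMPTY DIALOG COMMENT" falls through to COMMENT, because DIALOG_SOURCE stops matching at "EMPTY", while "0010/** DIALOG SOURCE 22D" is matched by both rules at the same length and goes to DIALOG_SOURCE because it is listed first.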


regards,


--
Bence Erős
CyclonePHP
core developer

Eric

Nov 20, 2013, 6:24:09 AM
to antlr-di...@googlegroups.com
This is not an answer but a question out of curiosity.

What language is this? Is it NATURAL or Dialog? Googling for "Natural programming language" or "Dialog programming language" does not turn up info on a programming language that looks like your sample.

When working with languages, I sometimes create the AST first and then work through some problems by hand using just the AST and not the human grammar. Once I understand the semantics of the language, I work back to a grammar. Since I don't see enough context here for the language, I would like to know more. Can you provide a link to the language specification, or at least the BNF or a description of what it is and what it is supposed to do?

Thanks.

Sourabh Gupta

Feb 7, 2017, 10:17:00 PM
to antlr-di...@googlegroups.com, eben...@gmail.com
How did you solve the COMMENTLINE1 error?
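
Going by the error output in the first post, the COMMENTLINE1 message looks like a follow-on error from the rule just above it, not a problem in COMMENTLINE1 itself. A sketch of the three spots involved, assuming the line numbers in the messages count from the "grammar Ns3mod ;" line; the names Ns3modLexerFixes and DIALOG_KEYWORD are made up for illustration:

```antlr
lexer grammar Ns3modLexerFixes ;   // hypothetical name

// Was: WS : [ \t\]+ -> skip ;
// '\]' escapes the bracket, so the character set never closes. That gives
// "mismatched character '\n' expecting ']'", and the next rule
// (COMMENTLINE1, grammar line 14) is then reported as unterminated.
WS : [ \t]+ -> skip ;

// Was: EOL: CR?LF
// The missing ';' is reported at the following rule (DIALOG_SOURCE, line 19).
EOL : CR? LF ;

// Was: "DIALOG" and the other double-quoted words.
// ANTLR 4 literals use single quotes, hence the '"' "complete surprise".
DIALOG_KEYWORD : 'DIALOG' ;   // hypothetical rule, for illustration only

LF : '\u000A' ;
CR : '\u000D' ;
```

The line=-1, charPosition=-1 in the first message probably comes from the same unterminated character set, since that error is raised while the grammar file itself is still being tokenized.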