Having a hard time with my first grammar


Joe Bryant

Dec 19, 2017, 11:16:36 AM
to antlr-discussion
I am creating my first grammar, which I thought would be pretty simple. I need to parse a line-oriented data file, and I am stuck on getting the first line recognized. Assume the first line contains:


KEYWORD:18.2

And my grammar has:

grammar PCG;

character : (line? EOL) + ;

line : id ':' DECIMAL
| id ':' ANY
| BLANK_LINE
| COMMENT_LINE
| id '|' DECIMAL
;

id : ID ;

ANY : ~[\r\n]+;

BLANK_LINE : [ \t]+;

TAG : [a-zA-Z]+;

COMMENT_LINE : '#' ~[\r\n]*;

EOL : [\r\n]+ ->skip;

DECIMAL : ([0-9]+ ('.' [0-9]+)?)
| ('.' [0-9]+)
;

ID : [a-zA-Z]+ ;


I always get this message:

line 1:0 mismatched input 'KEYWORD:18.2' expecting {BLANK_LINE, COMMENT_LINE, EOL, ID}

Any help would be appreciated to get me over my startup hump.

Jeff Saremi

Dec 19, 2017, 11:34:21 PM
to antlr-discussion
Joe,
I am a beginner too, so I will pass along a piece of advice that someone gave me on this forum.
Forget about all your parser rules for now.
Write a lexer-only grammar (lexer grammar MyGrammar;) that contains only the uppercase rules above.
Inside your test runner, add a block that prints all of your tokens after scanning.
Example in C#:

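            var lexer = new MyPreProcessorLexer(new AntlrInputStream(input));
            var tokenStream = new CommonTokenStream(lexer);
            // Read every token up to EOF into the buffer before printing.
            tokenStream.Fill();
            // Print each token so its type and text can be checked by eye.
            foreach (var token in tokenStream.GetTokens())
            {
                Console.WriteLine(token);
            }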


Then compare the token types and text against your input, character by character and word by word. Check that the tokens come out in the order you expected and with the types you wanted.
Until you are 100% satisfied, do not move on to the parsing stage.
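
To give a feel for what the token dump is likely to show for the grammar in the question, here is a rough Python sketch. It is not ANTLR: it only imitates the lexer's matching rule (take the longest match at the current position, and on a tie take the rule defined first) using the standard re module. The tokenize function and the RULES table are names made up for this illustration, the regular expressions are hand translations of the lexer rules, and the "-> skip" on EOL is not modelled.

```python
import re

# Hand-translated lexer rules from the grammar in the question, in definition
# order. The literals ':' and '|' used in parser rules are listed first,
# because ANTLR places implicit literal tokens ahead of explicit lexer rules.
RULES = [
    ("':'", r":"),
    ("'|'", r"\|"),
    ("ANY", r"[^\r\n]+"),
    ("BLANK_LINE", r"[ \t]+"),
    ("TAG", r"[a-zA-Z]+"),
    ("COMMENT_LINE", r"#[^\r\n]*"),
    ("EOL", r"[\r\n]+"),  # the "-> skip" action is not modelled here
    ("DECIMAL", r"[0-9]+(?:\.[0-9]+)?|\.[0-9]+"),
    ("ID", r"[a-zA-Z]+"),
]


def tokenize(text, rules=RULES):
    """Imitate longest-match lexing; ties go to the rule defined first."""
    tokens = []
    pos = 0
    while pos < len(text):
        best = None  # (rule name, end offset) of the best match so far
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strictly longer matches replace the current best, so an
            # earlier rule keeps priority when two rules match equally far.
            if m and m.end() > pos and (best is None or m.end() > best[1]):
                best = (name, m.end())
        if best is None:
            raise ValueError(f"no rule matches at offset {pos}")
        name, end = best
        tokens.append((name, text[pos:end]))
        pos = end
    return tokens


if __name__ == "__main__":
    # With every rule present, ANY matches the whole line and wins.
    print(tokenize("KEYWORD:18.2"))

    # With ANY left out, the line splits up, but TAG shadows ID.
    without_any = [rule for rule in RULES if rule[0] != "ANY"]
    print(tokenize("KEYWORD:18.2", without_any))
```

In this sketch the whole line comes back as a single ANY token, which is consistent with the parser complaining about 'KEYWORD:18.2' as one piece of input. With ANY left out, the line splits into three tokens, but the first is TAG rather than ID, because TAG has the same pattern and is defined earlier. A real token dump from the generated lexer is still the authority; this only shows the kind of thing to look for.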
Jeff

Joe Bryant

Dec 20, 2017, 10:07:42 AM
to antlr-discussion
That was amazing advice, and it is indeed helping. Thank you very much.