Problem with grammar

Butz Wonker

Nov 11, 2018, 10:28:50 AM

to antlr-discussion

The grammar below yields an error for the input "Person Address=%John%" (without quotes). The lexer produces the expected tokens, but the parser then reports that it expects an ID:

ID ("Person")
ID ("Address")
EQ ("=")
STRING ("%John%")
line 1:21 mismatched input '<EOF>' expecting ID

How can it expect an ID after searchclause has been parsed? It should expect EOF. I'm totally new to ANTLR, so I'm sure it's my error, but what am I doing wrong?

Is the reason that NOT_SPECIAL+ matches EOF? I tried putting EOF in the set but this is not supported.

/* ANTLR Grammar for Minidb Query Language */

grammar Mdb;

start
    : searchclause EOF
    ;

searchclause
    : table expr
    ;

expr
    : fieldsearch
    | unop fieldsearch
    | LPAREN expr relop expr RPAREN
    ;

unop
    : NOT
    ;

relop
    : AND
    | OR
    ;

fieldsearch
    : field EQ searchterm
    ;

field
    : ID
    ;

table
    : ID
    ;

searchterm
    : ID
    | STRING
    ;

AND
    : 'and'
    ;

OR
    : 'or'
    ;

NOT
    : 'not'
    ;

EQ
    : '='
    ;

LPAREN
    : '('
    ;

RPAREN
    : ')'
    ;

fragment VALID_ID_START
    : ('a' .. 'z') | ('A' .. 'Z') | '_'
    ;

fragment VALID_ID_CHAR
    : VALID_ID_START | ('0' .. '9')
    ;

NOT_SPECIAL
    : ~(' ' | '\t' | '\n' | '\r' | '\'' | '"' | ';' | '.' | '=' )
    ;

ID
    : VALID_ID_START VALID_ID_CHAR*
    ;

STRING
    : NOT_SPECIAL+
    | '"' ~('\n'|'"')* ('"'
    | { panic("syntax-error - unterminated string literal") } )
    ;

WS
    : [ \r\n\t]+ -> skip
    ;

Eric R

Nov 13, 2018, 10:02:30 AM

to antlr-discussion

Never mind. As someone on Stack Overflow pointed out, there is nothing wrong with the grammar. I used the Go target and printed the tokens as follows:

    is := antlr.NewInputStream(s)

    // Create the Lexer
    lexer := parser.NewMdbLexer(is)

    stream := antlr.NewCommonTokenStream(lexer, antlr.TokenDefaultChannel)

    for {
        t := lexer.NextToken()
        if t.GetTokenType() == antlr.TokenEOF {
            break
        }
        fmt.Printf("%s (%q)\n",
            lexer.SymbolicNames[t.GetTokenType()], t.GetText())
    }

That loop consumes every token from the lexer, and I forgot to reset it before creating the parser, so the parser started at EOF, which explains the error message.
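
For anyone who runs into the same thing, here is a minimal toy model of the mistake. It does not use the ANTLR runtime: `toyLexer`, `drain`, `Reset`, and `firstTokenSeenByParser` are hypothetical names invented for this sketch, and the "lexer" simply splits on whitespace. It only illustrates why a debug loop that drains a stateful lexer leaves nothing but EOF for whatever reads from it next.

```go
package main

import (
	"fmt"
	"strings"
)

// toyLexer is a hypothetical stand-in for a generated lexer (not ANTLR API).
// It hands out whitespace-separated words one at a time and keeps a cursor.
type toyLexer struct {
	words []string
	pos   int
}

func newToyLexer(input string) *toyLexer {
	return &toyLexer{words: strings.Fields(input)}
}

// NextToken returns the next word, or "<EOF>" once the input is exhausted.
func (l *toyLexer) NextToken() string {
	if l.pos >= len(l.words) {
		return "<EOF>"
	}
	w := l.words[l.pos]
	l.pos++
	return w
}

// Reset rewinds the cursor to the start of the input.
func (l *toyLexer) Reset() { l.pos = 0 }

// drain mimics the debug loop: it pulls tokens until EOF and returns them.
func drain(l *toyLexer) []string {
	var out []string
	for {
		t := l.NextToken()
		if t == "<EOF>" {
			return out
		}
		out = append(out, t)
	}
}

// firstTokenSeenByParser mimics a parser asking the lexer for its first token.
func firstTokenSeenByParser(l *toyLexer) string {
	return l.NextToken()
}

func main() {
	l := newToyLexer("Person Address = %John%")

	fmt.Println(drain(l))                  // the debug dump sees all four tokens
	fmt.Println(firstTokenSeenByParser(l)) // <EOF>: the lexer is already exhausted

	l.Reset()
	fmt.Println(firstTokenSeenByParser(l)) // Person: after rewinding, parsing can start
}
```

In the real program, the equivalent is to make sure the parser gets a lexer that has not been consumed yet, for example by building a fresh input stream and lexer for the parser after the debug dump.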
