I am new to antlr3 - why do I get lexer errors while antlr3Works can parse the same sample code?

Lothar Behrens

Apr 12, 2015, 4:44:22 AM

to antlr-di...@googlegroups.com

Hi,

I am new to antlr3 and have chosen to use version 3 because of its C support.

Currently I am using ANTLR v3.5 (January 4, 2013) and its C runtime.

ANTLRWorks version is 1.5 (ANTLR 3.5, StringTemplate v3 3.2.1, StringTemplate v4 4.0.7-SNAPSHOT)

I was able to create a simple expression grammar that I could parse within the integrated C antlr3 runtime.

My language should start out as a purely declarative one for now, and I am able to test it within ANTLRWorks.

When I generate the code, I get many warnings, but I am not familiar enough with ANTLR to interpret them.

This is the sample ui declaration:

ui lbdmf "lbDMF Manager"
declare data
infer todo
end data
declare forms
default fieldtype text
form anwendungen "Anwendungen"
use data.anwendungen a default
field title "Titel"
field name "Name" shows a
field desc "Description" shows a.description
field requirement "Rquirement" as richtext
field application_type "Type" refers at using type_name over at.id
end form
end forms
end ui

The code compiles, but unlike the simple expression parser I used before this one,
this language gives me errors like these:

lbUIDsl::init() called.
Parser inits for ui lbdmf "lbDMF Manager"
declare data...

[snip]

Parser parses...
ABCD(1) : lexer error 3 :
    at offset 3, near ' ' :
    lbdmf "lbDMF Manage
ABCD(1) : lexer error 3 :
    at offset 9, near ' ' :
    "lbDMF Manager"
dec
ABCD(1) : lexer error 3 :
    at offset 25, near char(0XA) :

Any help?

Thanks,

Lothar

This is my grammar file:

grammar ui;

options {
    language=C;
    output=AST;
}

/*
tokens {
    PLUS    = '+' ;
    MINUS   = '-' ;
    MULT    = '*' ;
    DIV     = '/' ;
    UI      = 'ui' ;
    ENDUI   = 'end ui' ;
}
*/

@lexer::header {
#define Hidden 99
}

@header {
#define _empty NULL
}

@members {

}

/*------------------------------------------------------------------
* PARSER RULES
*------------------------------------------------------------------*/

ui    : 'ui' identifier STRING (declare_data)? (declare_forms)? 'end ui'
    ;

declare_forms
    :    'declare forms' (form)* 'end forms'
    |    'declare forms' (default_fieldtype) (form)* 'end forms'
    ;

default_fieldtype
    :    'default fieldtype' identifier
    ;

form    :    'form' identifier STRING (use_data)* (field)* 'end form'
    ;

field    :    'field' identifier STRING 'shows' identifier ('.' identifier)? ('as' identifier)?
    |    'field' identifier STRING 'as' identifier
    |    'field' identifier STRING 'refers' identifier 'using' identifier 'over' identifier '.' identifier
    |    'field' identifier STRING
    ;

use_data:    'use data.' identifier identifier ('default')?
    ;

declare_data
    :    'declare data' (infer | use)? 'end data'
    ;

use    :    'use' STRING
    ;

infer    :    'infer' 'yes'
// Non existing is no
//    |    'infer' 'no'
    |    'infer' 'todo'
    ;

/*
expr    : term ( ( PLUS | MINUS ) term )* ;

term    : factor ( ( MULT | DIV ) factor )* ;

factor : NUMBER ;
*/

identifier
    :    ALPHA
    |    ALPHA NUMBER
    ;

/*------------------------------------------------------------------
* LEXER RULES
*------------------------------------------------------------------*/

//string_guts
//     : (~'"')*
// I do not really need escapes yet, so skip it due to antlrWorks bugs
//    | (ESC | ~('"' | '\\'))*
//     ;

// also a fragment rule perhaps?
fragment ESC
: ('\\' 's')
| ('\\' 'x')
| ('\\' 'd')
| ('\\' 'l')
| ('\\' '\\')
;

STRING
: '"' (~'"')* '"'
;

NUMBER : (DIGIT)+ ;

ALPHA
    :    ('a'..'z' | 'A'..'Z' | '_')+ ;

//STRINGCHAR
//    :    (~'"')*
//    ;

fragment WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+    { $channel = Hidden; } ;

fragment DIGIT : '0'..'9' ;

Lothar Behrens

Apr 12, 2015, 7:49:25 AM

to antlr-di...@googlegroups.com

Hi,

After some trial and error, I found that the WHITESPACE rule must not be a fragment. Removing the fragment keyword resolved the lexer errors.
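
For reference, here is a minimal sketch of what the corrected rule might look like. It assumes the ANTLR 3 C target, where the runtime headers should already define HIDDEN as channel 99, so the custom Hidden define in @lexer::header is presumably no longer needed. A fragment rule is only a helper that other lexer rules can call and never produces a token on its own. With WHITESPACE marked as a fragment, the lexer has no top-level rule that matches a space or newline, which is consistent with the errors reported near ' ' and char(0XA).

```antlr
// Lexer rule without 'fragment': it is now matched at the top level.
WHITESPACE
    :   ( '\t' | ' ' | '\r' | '\n' | '\u000C' )+
        { $channel = HIDDEN; }   // keep whitespace away from the parser
    ;

// DIGIT can stay a fragment: it is only used inside NUMBER.
fragment DIGIT : '0'..'9' ;
```

Only the WHITESPACE rule changes. The rest of the grammar is left as posted.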
