I am new to antlr3 - why do I get lexer errors while antlr3Works can parse the same sample code?

Lothar Behrens

Apr 12, 2015, 4:44:22 AM
to antlr-di...@googlegroups.com
Hi,

I am new to ANTLR 3 and chose version 3 because of its C support.

Currently I am using ANTLR v3.5 (January 4, 2013) and its C runtime.

ANTLRWorks version is 1.5 (ANTLR 3.5, StringTemplate v3 3.2.1, StringTemplate v4 4.0.7-SNAPSHOT)

I was able to create a simple expression grammar that I could parse within the integrated C antlr3 runtime.

For now my language is purely declarative, and I can test it successfully in ANTLRWorks.

When I generate the code, I get many warnings, but I am not yet familiar enough with ANTLR to interpret them.

This is the sample ui declaration:

ui lbdmf "lbDMF Manager"
declare data
infer todo
end data
declare forms
default fieldtype text
form anwendungen "Anwendungen"
use data.anwendungen a default
field title "Titel"
field name "Name" shows a
field desc "Description" shows a.description
field requirement "Rquirement" as richtext
field application_type "Type" refers at using type_name over at.id
end form
end forms
end ui

The generated code compiles, but unlike the simple expression parser I used before,
this language gives me errors like these:

lbUIDsl::init() called.
Parser inits for ui lbdmf "lbDMF Manager"
declare data...

[snip]

Parser parses...
ABCD(1) : lexer error 3 :
     at offset 3, near ' ' :
     lbdmf "lbDMF Manage
ABCD(1) : lexer error 3 :
     at offset 9, near ' ' :
     "lbDMF Manager"
dec
ABCD(1) : lexer error 3 :
     at offset 25, near char(0XA) :

Any help?

Thanks,

Lothar



This is my grammar file:

grammar ui;
 
options {
    language=C;
    output=AST;
}
 
/*
tokens {
    PLUS    = '+' ;
    MINUS   = '-' ;
    MULT    = '*' ;
    DIV     = '/' ;   
    UI      = 'ui' ;
    ENDUI   = 'end ui' ;
}
*/
 
@lexer::header {
#define Hidden 99
}
 
@header {
#define _empty NULL
}
 
@members {

}
 
/*------------------------------------------------------------------
 * PARSER RULES
 *------------------------------------------------------------------*/

ui    : 'ui' identifier STRING (declare_data)? (declare_forms)? 'end ui'
    ;
   
declare_forms
    :    'declare forms' (form)* 'end forms'
    |    'declare forms' (default_fieldtype) (form)* 'end forms'
    ;

default_fieldtype
    :    'default fieldtype' identifier
    ;


form    :    'form' identifier STRING (use_data)* (field)* 'end form'
    ;

field    :    'field' identifier STRING 'shows' identifier ('.' identifier)? ('as' identifier)?
    |    'field' identifier STRING 'as' identifier
    |    'field' identifier STRING 'refers' identifier 'using' identifier 'over' identifier '.' identifier
    |    'field' identifier STRING
    ;

use_data:    'use data.' identifier identifier ('default')?
    ;

declare_data
    :    'declare data' (infer | use)? 'end data'
    ;

use    :    'use' STRING
    ;

infer    :    'infer' 'yes'
// Non existing is no
//    |    'infer' 'no'
    |    'infer' 'todo'
    ;
   
/*
expr    : term ( ( PLUS | MINUS )  term )* ;
 
term    : factor ( ( MULT | DIV ) factor )* ;
 
factor  : NUMBER ;
*/

identifier
    :    ALPHA
    |    ALPHA NUMBER
    ;

/*------------------------------------------------------------------
 * LEXER RULES
 *------------------------------------------------------------------*/

//string_guts
//      : (~'"')*
// I do not really need escapes yet, so skip it due to antlrWorks bugs     
//    | (ESC | ~('"' | '\\'))*
//      ;

// also a fragment rule perhaps?
fragment ESC
  :  ('\\' 's')
  |  ('\\' 'x')
  |  ('\\' 'd')
  |  ('\\' 'l')
  |  ('\\' '\\')
  ;

STRING    
  :  '"' (~'"')* '"'
  ;
 
NUMBER  : (DIGIT)+ ;

ALPHA
    :    ('a'..'z' | 'A'..'Z' | '_')+ ;

//STRINGCHAR
//    :    (~'"')*
//    ;

fragment WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+    { $channel = Hidden; } ;
 
fragment DIGIT  : '0'..'9' ;

Lothar Behrens

Apr 12, 2015, 7:49:25 AM
to antlr-di...@googlegroups.com
Hi,

After some trial and error, I found that the WHITESPACE rule must not be a fragment. Removing the fragment keyword resolved the lexer errors.
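
For anyone hitting the same errors, here is a minimal sketch of that change, assuming the rest of the grammar stays as posted. A fragment rule is only a helper that other lexer rules can call and never produces a token on its own, so with WHITESPACE declared as a fragment the lexer had no rule that could match the spaces and newlines in the input.

```antlr
// WHITESPACE as a regular lexer rule: without 'fragment' the lexer can match
// it as a token of its own, and the action moves that token to the hidden
// channel so the parser never sees it.
// 'Hidden' is the value defined as 99 in the grammar's @lexer::header section.
WHITESPACE : ( '\t' | ' ' | '\r' | '\n' | '\u000C' )+    { $channel = Hidden; } ;

// DIGIT can stay a fragment because it is only referenced from NUMBER.
fragment DIGIT  : '0'..'9' ;
```

Note that the quoted multi-word literals in the parser rules, such as 'end ui' or 'declare forms', are still matched as single tokens containing exactly one space, so hiding whitespace does not make them tolerant of extra spaces or line breaks.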