Strange error message

vladimir kozhaev

Aug 12, 2016, 7:26:30 AM

to antlr-discussion

Hi all, I have created the following grammar. Please see it below:

lexer grammar FileTriggerLexer;

@header { 
 }

STEP
:
	'/' INTEGER
;

SCHEDULE
:
	'Schedule'
;

SEMICOLON
:
	';'
;

ASTERISK
:
	'*'
;

CRON
:
	'cron'
;

MARKET_CRON
:
	'marketCron'
;

COMBINED
:
	'combined'
;

FILE_FEED
:
	'FileFeed'
;

LBRACKET
:
	'('
;

RBRACKET
:
	')'
;

PERCENT
:
	'%'
;

INTEGER
:
	[0-9]+
;

MINUTES_INTERVAL
:
	[1-59]
;

HOURS_INTERVAL
:
	[0-23]
;

WEEK_DAYS_INTERVAL
:
	[1-7]
;

MONTH_INTERVAL
:
	[1-12]
;

DAYS_OF_MONTH_INTERVAL
:
	[1-31]
;

DASH
:
	'-'
;

NUMBER
:
	[0-9]+
;

 

DOUBLE_QUOTE
:
	'"'
;
 
QUOTE
:
	'\''
;

SLASH
:
	'/'
;

DOT
:
	'.'
;

COMMA
:
	','
;

UNDERSCORE
:
	'_'
;

ID
:
	[a-zA-Z] [a-zA-Z0-9]*
; 



REGEX
:
	(
		ID
		| DOT
		| ASTERISK
		| NUMBER
		|PERCENT
	)+
;

WS
:
	[ \t\r\n]+ -> skip
; // skip spaces, tabs, newlines

And the following code to process it:

package com.idc.omega.validation.spin;

import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.regex.Pattern;

import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.BaseErrorListener;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.Parser;
import org.antlr.v4.runtime.ParserRuleContext;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.Recognizer;
import org.antlr.v4.runtime.tree.ParseTreeWalker;

import com.idc.omega.validation.spin.FileTriggerValidatorParser.Source_fileContext;

public class FileGeneratorValdiator {

	protected static final Pattern SRC_PATTERN = Pattern.compile(
			"([A-Z]-)?[A-Z_]+(/[A-Z_]+)?"
		);

	
	public static class VerboseListener extends BaseErrorListener {
		List<String> errorMessagesList = new LinkedList<String>();

		@Override
		public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine,
				String msg, RecognitionException e) {
			List<String> stack = ((Parser) recognizer).getRuleInvocationStack();
			Collections.reverse(stack);

			errorMessagesList.add(Arrays.toString(recognizer.getRuleNames())+"line " + line + ":" + charPositionInLine + ": " + msg);
		}

		public List<String> getErrorMessagesList() {
			return errorMessagesList;
		}

	}

	public class PostParsingValidator extends FileTriggerValidatorBaseListener {
		List<String> errorMessagesList = new LinkedList<String>();
		public List<String> getErrorMessagesList() {
			return errorMessagesList;
		}

		@Override
		public void enterEveryRule(ParserRuleContext arg0) {
			System.out.println("rule name:" + FileTriggerValidatorParser.ruleNames[arg0.getRuleIndex()] + ", text:"
					+ arg0.getText());

		}

		@Override
		public void enterSource_file(Source_fileContext ctx) {
			if(!SRC_PATTERN.matcher(ctx.getText()).matches()){
				errorMessagesList.add(""+ctx.getAltNumber()+ " incorrect file context format:0"+ctx.getText());
			}
		}
		

				
	};

	public List<String> validate(String validateString) {

		// Get our lexer
		FileTriggerLexer lexer = new FileTriggerLexer(new ANTLRInputStream(validateString));

		// Get a list of matched tokens
		CommonTokenStream tokens = new CommonTokenStream(lexer);

		// Pass the tokens to the parser
		FileTriggerValidatorParser parser = new FileTriggerValidatorParser(tokens);

		VerboseListener verboseListener = new VerboseListener();
		parser.addErrorListener(verboseListener);
		// Specify our entry point
		ParserRuleContext drinkSentenceContext = parser.r();

		// Walk it and attach our listener
		ParseTreeWalker walker = new ParseTreeWalker();

		PostParsingValidator listener = new PostParsingValidator();
		walker.walk(listener, drinkSentenceContext);
		List<String> errors=new LinkedList<String>();
		errors.addAll(verboseListener.getErrorMessagesList());
		errors.addAll(listener.getErrorMessagesList());
		return errors;
	}
}

When I try to parse the following string: "Schedule;cron(\"*/3 * * * * America/New_York\");'TestFile'yyyy-M-dd-HH-mm;America/New_York;20"

I get this error message:

line 1:89: mismatched input '20' expecting NUMBER

This seems strange to me because, as I understand it, NUMBER : [0-9]+ ; should match "20".

Regards,

Vladimir

vladimir kozhaev

Aug 12, 2016, 9:46:24 AM

to antlr-discussion

Sorry, the second part of the grammar is:

/**
 * Define a grammar called Hello
 */
grammar FileTriggerValidator;

options
   {
	tokenVocab = FileTriggerLexer;
}

r
:
	(schedule
	| file_feed)+
;

expression
:
	schedule
	| file_feed
;

file_feed
:
	file_feed_name SEMICOLON source_file SEMICOLON source_host SEMICOLON
	source_host SEMICOLON regEx SEMICOLON regEx
	(
		SEMICOLON source_host
	)*
;

formatString
:
	source_host
	(
		'%' source_host?
	)* DOT source_host
;

regEx
:
	REGEX
;

source_host
:
	ID
	(
		DASH ID
	)*
;

file_feed_name
:
	FILE_FEED
;

source_file
:
	(
		ID
		| DASH
		| UNDERSCORE
	)+
;

schedule
:
	SCHEDULE SEMICOLON schedule_defining SEMICOLON file_name SEMICOLON timezone
	
	(
		SEMICOLON NUMBER
	)?
;

schedule_defining
:
	cron
	| market_cron
	| combined_cron
;

cron
:
	CRON LBRACKET DOUBLE_QUOTE cron_part timezone DOUBLE_QUOTE RBRACKET
;

market_cron
:
	MARKET_CRON LBRACKET DOUBLE_QUOTE cron_part timezone DOUBLE_QUOTE COMMA
	DOUBLE_QUOTE ID DOUBLE_QUOTE RBRACKET
;

combined_cron
:
	COMBINED LBRACKET cron_list_element
	(
		COMMA cron_list_element
	)* RBRACKET
;

mic_defining
:
	ID
;

file_name
:
	(
		ID
		| DOT
		| QUOTE
		| DASH
	)+
;

cron_list_element
:
	cron
	| market_cron
;
//

schedule_defined_string
:
	cron
;
// 

cron_part
:
	minutes hours days_of_month month week_days
;
//

minutes
:
	MINUTES_INTERVAL
	| with_step_value
;
//

hours
:
	HOURS_INTERVAL
	| with_step_value
;
//

int_list
:
	INTEGER
	(
		COMMA INTEGER
	)*
;

interval
:
	INTEGER DASH INTEGER
;
//

days_of_month
:
	DAYS_OF_MONTH_INTERVAL
	| with_step_value
;
//

month
:
	MONTH_INTERVAL
	| with_step_value
;
//

week_days
:
	WEEK_DAYS_INTERVAL
	| with_step_value
;
//

timezone
:
	timezone_part
	(
		SLASH timezone_part
	)?
;
//

timezone_part
:
	ID
	(
		UNDERSCORE ID
	)?
;
//

with_step_value
:
	(
		int_list
		| interval
		| ASTERISK
	) STEP?
;
//

Kevin Cummings

Aug 12, 2016, 9:50:10 AM

to antlr-di...@googlegroups.com

I don't think the following intervals are doing what you want them to:

> MINUTES_INTERVAL
> :
> [1-59]
> ;
>
> HOURS_INTERVAL
> :
> [0-23]
> ;
>
> WEEK_DAYS_INTERVAL
> :
> [1-7]
> ;
>
> MONTH_INTERVAL
> :
> [1-12]
> ;
>
> DAYS_OF_MONTH_INTERVAL
> :
> [1-31]
> ;

Look at your ID rule for comparison.
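
To illustrate the point about the interval rules: a bracket expression such as [1-59] is a set of single characters (the range '1' to '5', plus '9'), not the numeric range 1 to 59. The sketch below uses java.util.regex, whose character classes read the same way for this case, as a stand-in for the ANTLR lexer set. The class name CharSetDemo and the helper inRange are made up for this example and are not part of the posted code.

```java
import java.util.regex.Pattern;

final class CharSetDemo {
    // [1-59] read as a character class: the range '1'..'5' plus the single character '9'.
    static final Pattern MINUTES_AS_WRITTEN = Pattern.compile("[1-59]");

    // True if the whole input matches the character class exactly as written.
    static boolean matchesAsWritten(String input) {
        return MINUTES_AS_WRITTEN.matcher(input).matches();
    }

    // Hypothetical alternative: accept one or two plain digits, then check the numeric range.
    static boolean inRange(String digits, int min, int max) {
        if (!digits.matches("[0-9]{1,2}")) {
            return false;
        }
        int value = Integer.parseInt(digits);
        return value >= min && value <= max;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"3", "7", "9", "12", "59"}) {
            System.out.println(s + " -> asWritten=" + matchesAsWritten(s)
                    + ", rangeCheck=" + inRange(s, 1, 59));
        }
    }
}
```

One possible approach, under the same assumption, is to lex plain digits with a single rule and do a range check like inRange in the listener after parsing.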

> DASH
> :
> '-'
> ;

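Also, your INTEGER and your NUMBER rules lex the same text, so one of them
will be ignored. First come, first lexed. INTEGER is defined first, so '20'
becomes an INTEGER token and the parser never receives a NUMBER.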
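
A toy model of "first come, first lexed" may help here. It is only a sketch of the tie-breaking idea (longest match wins, and on a tie the rule defined earlier wins), not ANTLR's actual implementation. The class LexerTieBreakDemo and the method winningRule are invented for illustration, and only three of the rules from the grammar are mirrored.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LexerTieBreakDemo {
    // Rules in definition order, mirroring a few rules from the posted lexer grammar.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("INTEGER", Pattern.compile("[0-9]+"));
        RULES.put("NUMBER", Pattern.compile("[0-9]+"));
        RULES.put("ID", Pattern.compile("[a-zA-Z][a-zA-Z0-9]*"));
    }

    // Returns the name of the rule that wins at the start of the input, or null if none matches.
    static String winningRule(String input) {
        String best = null;
        int bestLength = 0;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            // Strict '>' means a later rule with an equally long match never replaces an earlier one.
            if (m.lookingAt() && m.end() > bestLength) {
                best = rule.getKey();
                bestLength = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(winningRule("20"));   // INTEGER
        System.out.println(winningRule("abc1")); // ID
    }
}
```

In this model NUMBER can never win, because INTEGER matches exactly the same text and comes first. That is consistent with the "mismatched input '20' expecting NUMBER" message.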

--
Kevin J. Cummings
kjc...@verizon.net
cumm...@kjchome.homeip.net
cumm...@kjc386.framingham.ma.us
Registered Linux User #1232 (http://www.linuxcounter.net/)
