Parsing data from a never-ending stream

andrey....@gmail.com
Feb 13, 2013, 4:18:54 PM
to antlr-di...@googlegroups.com
Is ANTLR suitable for parsing data from streams that don't reach EOF right after the text to parse?
From what I have observed, the lexer does not emit the current token until the first character of the next token is received.
On top of that, the parser does not seem to emit a rule until the first token of the next rule is received.
Here is a simple grammar I tried:

fox: 'quick' 'brown' 'fox' '\r'? '\n' ;

Then I used the generated parser with UnbufferedCharStream and UnbufferedTokenStream:

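// `is` is the input stream that delivers the data as it arrives
CharStream input = new UnbufferedCharStream(is);
MyLexer lex = new MyLexer(input);
// copy the text into each token, since the unbuffered char stream does not keep consumed characters
lex.setTokenFactory(new CommonTokenFactory(true));
TokenStream tokens = new UnbufferedTokenStream(lex);
MyParser parser = new MyParser(tokens);
MyParser.FoxContext fox = parser.fox();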

When the stream receives 'quick', nothing happens.
When 'b' comes in, the parser enters rule 'fox'.
Then 'rown': nothing (2 tokens are in the stream, but neither of them is known to the lexer yet!).
Only after 'f' does the listener visit the first token: 'quick'.
Then nothing on 'ox'.
On the newline (Unix): the listener visits token 'brown'.
Now the stream holds all the data (4 tokens), but only 2 tokens have been recognized.
I found that, to push the remaining tokens through the system, the stream has to deliver 2 more tokens, and any tokens known to the grammar will do.
It could be 2 extra newlines, or 'fox' and 'brown'.
Only then do the tokens 'fox' and '\n' get visited, the parser exits rule 'fox', and parsing finishes.
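
To make the lexer half of this lag concrete, here is a minimal sketch that needs no ANTLR at all. ToyLexer is a hypothetical class written for illustration, not ANTLR's implementation, and it assumes well-formed input with Unix newlines. It does longest-match lexing over the four literals of the grammar and emits a token only once the next character proves the token cannot grow:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical toy lexer for illustration only; this is not ANTLR's implementation. */
class ToyLexer {
    // The four literals of the 'fox' grammar (Unix newline only, '\r' left out).
    private static final List<String> LITERALS = Arrays.asList("quick", "brown", "fox", "\n");

    // Characters received so far that belong to a token not yet emitted.
    private final StringBuilder pending = new StringBuilder();

    private static boolean isPrefixOfSomeLiteral(String s) {
        for (String literal : LITERALS) {
            if (literal.startsWith(s)) {
                return true;
            }
        }
        return false;
    }

    /** Feeds newly arrived characters; returns the tokens that became certain. */
    List<String> feed(String chunk) {
        List<String> emitted = new ArrayList<>();
        for (char c : chunk.toCharArray()) {
            // Longest match: the pending token ends only once the next
            // character proves that it cannot grow any further.
            // No error handling: the input is assumed to be well formed.
            if (pending.length() > 0 && !isPrefixOfSomeLiteral(pending.toString() + c)) {
                emitted.add(pending.toString());
                pending.setLength(0);
            }
            pending.append(c);
        }
        return emitted;
    }

    public static void main(String[] args) {
        ToyLexer lexer = new ToyLexer();
        // Same arrival order as in the observations above, plus one extra newline.
        for (String chunk : new String[] {"quick", "b", "rown", "f", "ox", "\n", "\n"}) {
            String shown = chunk.replace("\n", "\\n");
            String emitted = lexer.feed(chunk).toString().replace("\n", "\\n");
            System.out.println("received '" + shown + "' -> emitted " + emitted);
        }
    }
}
```

In this sketch 'quick' comes out only when 'b' arrives, and the final '\n' stays pending until one more character is fed. That mirrors the one-character delay I see in the lexer; the additional one-token delay in the parser is not modeled here.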

Is that a bug or a feature?
Is there a way to eliminate that lag?

Thanks!
