Reactive parser using state machine

Petr Gladkikh

Apr 4, 2017, 1:01:57 PM
to antlr-discussion
Hello,

I am trying to change a few things in the Actson JSON parser (https://github.com/michel-kraemer/actson), which is itself based on JSON_checker (http://www.json.org/JSON_checker/).
It is a reactive parser: it parses whatever input is currently available and may synchronously emit one or more parsed JSON events. The parser therefore never blocks the current thread while waiting for the next piece of input.

Looking at the code generated by Antlr, I see that it uses loops to pull data from the input stream, which apparently would block whenever the next, not yet parsed part of the input is unavailable. So it is not suitable if you want to release the current thread while waiting for new data.
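
To make the pull pattern concrete, here is a minimal sketch of a consumer that owns its read loop. It is not Antlr's generated code; `PullDigitCounter` and `countDigits` are hypothetical names, and counting digits merely stands in for real parsing work.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical pull-style consumer: it decides when to read, so the caller's
// thread is tied up inside read() until the source delivers data or ends.
final class PullDigitCounter {
    static int countDigits(Reader in) throws IOException {
        int count = 0;
        int c;
        // Reader.read() blocks until a character is available, an I/O error
        // occurs, or the end of the stream is reached (-1).
        while ((c = in.read()) != -1) {
            if (c >= '0' && c <= '9') {
                count++;
            }
        }
        return count;
    }
}
```

With a `StringReader` the loop returns immediately, but with a socket or pipe behind the `Reader` the same call would hold the thread until the peer sends more data or closes the stream.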

If you look at Actson's code (https://github.com/michel-kraemer/actson/blob/master/src/main/java/de/undercouch/actson/JsonParser.java), you'll see that the core of the parser is a hand-crafted transition table with a lot of complicated supporting logic. This code is rather hard to change, so I am looking for ways to generate the parser from a syntax declaration instead.

So the question is: can I use the state machine generated by Antlr to implement a reactive parser? That is, a parser that is called with each new piece of the input stream and returns the new parser state and possibly parse events (instead of the parser itself reading from the input stream, as Antlr and most other parsers currently do).
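
The kind of interface meant here can be sketched as follows. This is only an illustration under stated assumptions: `PushListParser`, its `feed` method, and the string events are hypothetical names, not Actson's or Antlr's API, and the toy grammar accepts only non-empty bracketed lists of non-negative integers such as `[12,345]`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical push-style ("reactive") parser. The caller hands over each
// chunk as it arrives; the parser keeps its state between calls and never
// reads from a stream itself.
final class PushListParser {
    enum State { EXPECT_OPEN, EXPECT_NUMBER, IN_NUMBER, DONE }

    private State state = State.EXPECT_OPEN;
    private long current; // value of the number currently being read

    /** Consumes one chunk and returns the events it completed. */
    List<String> feed(CharSequence chunk) {
        List<String> events = new ArrayList<>();
        for (int i = 0; i < chunk.length(); i++) {
            char c = chunk.charAt(i);
            boolean digit = c >= '0' && c <= '9';
            switch (state) {
                case EXPECT_OPEN:
                    if (c != '[') throw error(c);
                    events.add("START");
                    state = State.EXPECT_NUMBER;
                    break;
                case EXPECT_NUMBER:
                    if (!digit) throw error(c);
                    current = c - '0';
                    state = State.IN_NUMBER;
                    break;
                case IN_NUMBER:
                    if (digit) {
                        current = current * 10 + (c - '0');
                    } else if (c == ',') {
                        events.add("NUMBER " + current);
                        state = State.EXPECT_NUMBER;
                    } else if (c == ']') {
                        events.add("NUMBER " + current);
                        events.add("END");
                        state = State.DONE;
                    } else {
                        throw error(c);
                    }
                    break;
                default: // DONE: no further input is accepted
                    throw error(c);
            }
        }
        return events;
    }

    State state() {
        return state;
    }

    private IllegalStateException error(char c) {
        return new IllegalStateException("unexpected '" + c + "' in state " + state);
    }
}
```

Feeding the chunks `"[1"`, `"2,3"` and `"45]"` one after another yields `START`, then `NUMBER 12`, then `NUMBER 345` followed by `END`: a number split across chunk boundaries is completed on a later call, and between calls the thread is free to do other work.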

Y2i

Apr 5, 2017, 11:59:22 AM
to antlr-discussion
Because an Antlr4-generated parser is an adaptive LL(*) parser, it may, depending on the grammar and the input, need to read the whole input before returning any answer. By default, parsers generated by Antlr4 buffer the whole stream. This can be changed by configuring the parser with UnbufferedCharStream, but I don't think that can be used to read input reactively. From the JavaDoc comment:

"Unbuffered" here refers to fact that it doesn't buffer all data, not that's it's on demand loading of char.

There is a nextChar() method that allows changing the source of characters from InputStream, but it seems to be called in a synchronous manner.
I could be wrong; maybe there is a way to configure it to read reactively, but I doubt it.