How to diagnose infinite (or just veeeeeery long) loop when parsing

Marco Mastropaolo

unread,

Oct 3, 2014, 5:28:06 AM10/3/14

to antlr-di...@googlegroups.com

Hi everyone, I'm Marco Mastropaolo, the author of a Lua interpreter and am using ANTLR4.3 for parsing (in C#) - and it's the first time I navigate in this group.

I have a grammar I'm using and an evil script file here: https://www.dropbox.com/sh/z19czj6nttw1ou8/AABpwaQTY3f7t6RwlgTHchXQa?dl=0

If I try to parse that file (which is actually a version reduced to the minimum which presents the problem) it seems stuck forever.
I tried using the Java version and then doing:

grun Lua chunk bugged.lua -trace -diagnostics

and it prints

enter   chunk, LT(1)=onInsertItem
enter   block, LT(1)=onInsertItem
enter   stat, LT(1)=onInsertItem
enter   varlist, LT(1)=onInsertItem
enter   var, LT(1)=onInsertItem
consume [@0,2:13='onInsertItem',<50>,2:0] rule var
exit    var, LT(1)==
exit    varlist, LT(1)==
consume [@1,15:15='=',<5>,2:13] rule stat
enter   explist, LT(1)=function
enter   exp, LT(1)=function
enter   anonfunctiondef, LT(1)=function
consume [@2,17:24='function',<17>,2:15] rule anonfunctiondef
enter   funcbody, LT(1)=(
consume [@3,25:25='(',<8>,2:23] rule funcbody
enter   parlist, LT(1)=self
enter   namelist, LT(1)=self
consume [@4,26:29='self',<50>,2:24] rule namelist
consume [@5,30:30=',',<10>,2:28] rule namelist
consume [@6,32:35='item',<50>,2:30] rule namelist
exit    namelist, LT(1)=)
exit    parlist, LT(1)=)
consume [@7,36:36=')',<45>,2:34] rule funcbody
enter   block, LT(1)=return
enter   retstat, LT(1)=return
consume [@8,40:45='return',<23>,3:1] rule retstat
enter   explist, LT(1)=(
enter   exp, LT(1)=(
enter   prefixexp, LT(1)=(
enter   varOrExp, LT(1)=(

It happens with both C# and Java versions and I tried with 4.2.2, 4.3 and 4.4.

Now I don't know how to go further on my diagnostics - I think it's likely to be a bug in the grammar having some ambiguous declaration causing exponential times.. but I don't really know how to go on.

Any help is really appreciated, thanks!!

-- marco

Marco Mastropaolo

unread,

Oct 3, 2014, 6:00:43 AM10/3/14

to antlr-di...@googlegroups.com

Bence Erős

unread,

Oct 3, 2014, 6:06:10 AM10/3/14

to antlr-di...@googlegroups.com

Hi,

this might help: https://theantlrguy.atlassian.net/wiki/pages/viewpage.action?pageId=1900591

--
You received this message because you are subscribed to the Google Groups "antlr-discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to antlr-discussi...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

--

Bence Erős
CyclonePHP core developer

http://erosbence.blogspot.com

Marco Mastropaolo

unread,

Oct 3, 2014, 6:18:36 AM10/3/14

to antlr-di...@googlegroups.com

Hi,
thanks for the reply.
I'm already doing two stage parsing in my application, and the parser doesn't even throw the exception - it hangs at the SLL stage.

Marco Mastropaolo

unread,

Oct 3, 2014, 8:58:33 AM10/3/14

to antlr-di...@googlegroups.com

I had moderate success replacing the above exp definition with:

expterm

    ;

powerExp : expterm | <assoc=right> expterm operatorPower powerExp;
unaryExp : powerExp | operatorUnary unaryExp;
muldivExp : unaryExp | unaryExp operatorMulDivMod muldivExp;
addsubExp : muldivExp | muldivExp operatorAddSub addsubExp;
strcatExp : addsubExp | <assoc=right> addsubExp operatorStrcat strcatExp;
compareExp : strcatExp | strcatExp operatorComparison compareExp;
logicAndExp : compareExp | compareExp operatorAnd logicAndExp;
exp : logicAndExp | logicAndExp operatorOr exp;

which is written "old-style". The only side-effect is that this gives very deep ASTs for me to parse - nothing terrible, but I'll have to work around it to avoid spending tons of time navigating dummy AST nodes.

A question still remains.. is this to be considered a bug in ANTLR4 ? After all the previous grammar had not ambiguities (at least, none reported by diagnostics), the script was not too complex and yet the left-recursion transformations didn't quite workout the correct way [btw - it's not an infinite loop but it seems every "or.. and.." of that evil scripts increments processing time exponentially].

Thanks
Marco

Terence Parr

unread,

Oct 3, 2014, 10:22:50 AM10/3/14

to antlr-di...@googlegroups.com

Strange. We earlier had an issue with this but Sam fixed a performance problem.

It’s likely now that you have a full-context LL decision in your grammar that sees into or over expr. ALL(*) w/o DFA caching is very slow and full LL kills that. Use the intellij ANTLR plugin profiler to find the LL vs SLL decisions.

Ter

Marco Mastropaolo

unread,

Oct 3, 2014, 10:52:11 AM10/3/14

to antlr-di...@googlegroups.com

Hi thanks for the response, I'll try IntelliJ profiler.

But, if I had a full-context LL decision, wouldn't the parser throw an exception when run in SLL mode ?

What I find odd is that it takes that long time even in SLL..

Terence Parr

unread,

Oct 3, 2014, 11:06:10 AM10/3/14

to antlr-di...@googlegroups.com

If it is slow in pure SLL, then it’s probably a bug

T

Marco Mastropaolo

unread,

Oct 3, 2014, 1:16:15 PM10/3/14

to antlr-di...@googlegroups.com

I run the profiler.

Honestly I don't find its output very useful - so I took a screenshot in case anyone has an idea ..

https://www.dropbox.com/s/7p421x9kph69kxb/profiler.JPG?dl=0

Terence Parr

unread,

Oct 3, 2014, 2:49:13 PM10/3/14

to antlr-di...@googlegroups.com

hmm…doesn’t look too bad but i don’t think you sorted it by time or anything. says 99% spent in prediction so something is up. see if full context column has any elements. send in smallest grammar that exhibits problem with sample input?

Ter

Sam Harwell

unread,

Oct 3, 2014, 9:07:10 PM10/3/14

to antlr-di...@googlegroups.com

Thanks for the report. I’ll take a look this weekend.

It would help if you could open an issue on the issue tracker and include the grammar and sample input, so we can keep track of the final resolution in case other people face similar issues in the future.

https://github.com/antlr/antlr4/issues

Thank you,

Sam Harwell

Marco Mastropaolo

unread,

Oct 4, 2014, 5:01:24 PM10/4/14

to antlr-di...@googlegroups.com

Thanks to you. I just posted it on github issues, sorry for not being able to do it before, but I was out the whole day.

I still have some suspects it might be caused by something wrong in the grammar, but honestly I don't know how to go on with my resources (I'm no expert in this field) and I can't see the numbers in the profiler being of particular help.. at least not to me.

Thanks again,
Marco

Reply all

Reply to author

Forward