Upgrading from 4.6 to 4.7 triggers a stack overflow in an existing project.

Niels Basjes

Jul 11, 2017, 4:54:12 AM
to antlr-discussion
Hi,

I use ANTLR to parse user agent strings in this project: 

Yesterday I noticed that a newer version (4.7) was available than the one I was using (4.6), so I updated the dependencies and ran my tests.
Nearly all of them failed with a stack overflow.

I reduced the code that reproduces the problem to the smallest fragment I could think of.


I would really appreciate any pointers to figure out what I have done wrong.

Thanks.

Niels Basjes

Niels Basjes

Jul 11, 2017, 5:31:44 AM
to antlr-discussion
I have been able to reduce the reproduction even further.
The problem seems to occur when two patterns in the lexer can match.

Niels Basjes


On Tuesday, July 11, 2017 at 10:54:12 UTC+2, Niels Basjes wrote:

Eric Vergnaud

Jul 12, 2017, 2:05:16 PM
to antlr-discussion
I'm not sure I understand the following in your grammar:

WORD : [a-zA-Z0-9-+]+

What is the purpose of the + inside the set?

Niels Basjes

Jul 14, 2017, 1:31:26 AM
to antlr-discussion
Hi,

The purpose of the + is that a string like "Foo-Bar+Baz" is matched.

It turns out the problem is that the '-' may not appear 'in the middle' of such a set. The stack overflow itself is a bug in ANTLR.

Niels Basjes

On Wednesday, July 12, 2017 at 20:05:16 UTC+2, Eric Vergnaud wrote:

Eric Vergnaud

Jul 14, 2017, 1:09:33 PM
to antlr-discussion
I think you need to escape both the - and the +.

Ivan Kochurkin

Jul 14, 2017, 5:55:14 PM
to antlr-discussion
You are wrong. The '+' should not be escaped, as it is not one of the special character-set symbols (']', '-').
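
For illustration, here is a minimal sketch of how the WORD rule quoted earlier in this thread might be written so that the hyphen is read as a literal character instead of a range operator. The grammar name WordSketch is hypothetical and only the WORD rule comes from the thread. This assumes the backslash escape for '-' inside a lexer character set behaves as described in the ANTLR lexer rule documentation, and it has not been verified against the original project.

```antlr
// Hypothetical stand-alone lexer grammar; only WORD is taken from the thread.
lexer grammar WordSketch;

// '\-' is an escaped, literal hyphen; '+' inside the set needs no escape.
// The '+' after the closing bracket is the usual "one or more" operator.
WORD : [a-zA-Z0-9\-+]+ ;

// Possible alternative, consistent with the remark that '-' may not appear
// "in the middle" of a set: place the hyphen last, e.g. [a-zA-Z0-9+-]+
// (an assumption, not tested here).
```

With either form, the intent is that input such as "Foo-Bar+Baz" is matched as a single WORD token.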

Eric Vergnaud

Jul 19, 2017, 10:26:27 AM
to antlr-discussion
I thought it could be interpreted as a repeat.