Need help with speeding up ANTLR4 parser


Pavel Velikhov

Jun 9, 2016, 4:09:31 PM
to antlr-discussion
Hi!

  We've extended the Python3 grammar (from the ANTLR grammar repository) with some query language capabilities and
built a preprocessor with a runtime. All looks great, except that parsing is very slow (multiple seconds on small
programs). I don't really have a clue how to optimise this: the grammar looks decent. I did put a bunch of lexical items
in the new rules, but when I moved them to the lexer, the speedup was negligible. I suppose the trouble is with the original
grammar or the way it was combined with the new one, but I don't have any plan of attack to fix this.

  A bunch of details with the profile trace are on the SO page: http://stackoverflow.com/questions/37623242/antlr4-very-slow-the-sll-trick-didnt-change-anything

Would appreciate any help!
Thanks and best regards,
Pavel Velikhov

Eric Vergnaud

Jun 9, 2016, 9:06:06 PM
to antlr-discussion
Hi,
First, you need to check your ANTLR runtime version; there was a major performance improvement in the latest version.
Second, you should try the same grammar in Java/IntelliJ, which comes with an ANTLR profiler.
The provided grammar samples are correct but not optimized.
Eric
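
For anyone who wants a first look at where the time goes without leaving Python, here is a minimal sketch using the standard library's cProfile. It is not the IntelliJ ANTLR profiler mentioned above and it knows nothing about grammar decisions; it only shows which Python functions dominate the run. `profile_call` and `toy_parse` are hypothetical names, and `toy_parse` is just a stand-in for whatever function builds your lexer and parser and calls the start rule.

```python
import cProfile
import io
import pstats


def profile_call(fn, *args, top=10):
    """Run fn(*args) under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def toy_parse(text):
    # Hypothetical stand-in for the generated parser's entry point.
    return [line.split() for line in text.splitlines()]


if __name__ == "__main__":
    tree, report = profile_call(toy_parse, "x = 1\ny = x + 2\n")
    print(report)
```

If most of the cumulative time ends up inside the runtime's prediction code rather than in your own actions, that points at the grammar or the target runtime, not at the embedded code.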

Pavel Velikhov

Jun 10, 2016, 6:16:59 AM
to antlr-discussion



Thanks Eric. I have the latest Python runtime. In the case of the Python language, the grammar is a bit target-language dependent
(all the indentation stuff), but I will try to use the ANTLR profiler.

Eric Vergnaud

Jun 10, 2016, 12:11:03 PM
to antlr-discussion
So if I understand you correctly, you have embedded Python productions in the grammar.
I suggest you remove them completely to track down performance issues efficiently.
There is a possibility that the issue actually originates from this embedded Python code.

Pavel Velikhov

Oct 20, 2016, 9:23:53 AM
to antlr-discussion
Didn't notice this thread.

I actually went ahead and removed all productions from the grammar, and then I removed all my changes to the Python3 grammar that I started
with. That grammar was just as slow to parse with the Python target, yet it takes milliseconds to parse with the Java target.
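
A comparison like this can be made repeatable with a small timing harness. The sketch below is only an illustration, assuming each grammar variant is wrapped in a callable that takes the source text; `time_parse`, `compare` and the two lambdas are hypothetical placeholders, not the actual parsers from this thread.

```python
import statistics
import time


def time_parse(parse_fn, source, repeats=5):
    """Return the median wall-clock time in seconds of parse_fn(source)."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        parse_fn(source)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(variants, source, repeats=5):
    """Time every named variant on the same input; return {name: seconds}."""
    return {name: time_parse(fn, source, repeats) for name, fn in variants.items()}


if __name__ == "__main__":
    # Placeholder callables; swap in functions that run the real parsers.
    variants = {
        "stock grammar": lambda src: src.split(),
        "extended grammar": lambda src: [tok.upper() for tok in src.split()],
    }
    for name, seconds in compare(variants, "x = 1\n" * 1000).items():
        print(f"{name}: {seconds * 1000:.3f} ms")
```

Taking the median of a few runs keeps a single slow first run (imports, cache warm-up) from skewing the numbers.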

My conclusion is that I can't hope to speed things up with the Python target. In the long term I was planning to write a parser by hand (for better error reporting), but it seems now that something must be done in the short term too, so currently I'm thinking about PLY.