Why is there no performance improvement when I just do recognition (don't build a parse tree)?

Roger Costello

unread,

Jan 9, 2015, 9:23:56 AM1/9/15

to antlr-di...@googlegroups.com

Hi Folks,

On page 47 of the ANLTR book it describes how to create a language recognizer; that is, how to avoid the overhead of building a parse tree. Here is the key to creating a language recognizer:

parser.setBuildParseTree(false); // don't waste time building a tree

So I created a grammar and fed it a large input string. I created just a language recognizer, using the above line of code (i.e., I didn't build a parse tree). I recorded the time that it took to process the input. Then I did the same thing, but this time I built the parse tree. I again recorded the time. The times were the same! That is, the time needed to simply recognize the input is the same as the time needed to create a parse tree.

I am surprised.

The ANTRL book seems to say that language recognition will be significantly faster than building a parse tree.

Would someone explain why I am getting no performance improvement with language recognition please?

Thanks!

/Roger

Terence Parr

unread,

Jan 9, 2015, 5:19:24 PM1/9/15

to antlr-di...@googlegroups.com

the gain is modest but usually there for big input. for small you won’t notice.
Ter

> --
> You received this message because you are subscribed to the Google Groups "antlr-discussion" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to antlr-discussi...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Martin Traverso

unread,

Jan 14, 2015, 6:12:32 PM1/14/15

to antlr-di...@googlegroups.com

So I created a grammar and fed it a large input string. I created just a language recognizer, using the above line of code (i.e., I didn't build a parse tree). I recorded the time that it took to process the input. Then I did the same thing, but this time I built the parse tree. I again recorded the time. The times were the same! That is, the time needed to simply recognize the input is the same as the time needed to create a parse tree.

You need to be very careful when benchmarking Java programs. The JVM is a complex system and certain optimizations only kick in after methods are invoked thousands of times. There are a bunch of other caveats you should be aware of [1]. I'd highly recommend using JMH [2] for all your Java micro-benchmarking needs.

Just for kicks, I tried did a quick experiment with my Antlr-based SQL parser, and this is the result I got with JMH (higher is better):

Benchmark Score Error Units

build-tree 32514.349 ± 446.803 ops/s

no-tree 50281.666 ± 990.778 ops/s

Martin

[1] http://shipilev.net/talks/jvmls-July2013-benchmarking.pdf

[2] http://openjdk.java.net/projects/code-tools/jmh/

Reply all

Reply to author

Forward