Visitor vs Listener for Tree-Based Interpreter

3,596 views
Skip to first unread message

Rick M

unread,
Dec 24, 2012, 12:16:09 AM12/24/12
to antlr-di...@googlegroups.com
I'm trying to build an interpreter for a Java-like language using ANTLR v4. I'm skimming through the v4 Reference as well as Language Implementation Patterns, which provides an excellent example. However, since Patterns was written before ANTLR v4, it talks only about Visitors, and not Listeners. I'm wondering if a Listener can be used for the execution phase, or if a Visitor is required (because, as I understand it, the Visitor can control the tree walk better).

Any advice would be much appreciated. Thanks!

Terence Parr

unread,
Dec 24, 2012, 2:04:14 PM12/24/12
to antlr-di...@googlegroups.com
Hi.I strongly advise that you do not use tree interpretation (visitor or listener) for building interpreters. A byte code interpreter will be much much easier to implement and understand.

personally I would use a listener to generate byte codes and then use one of the byte code interpreters from the patterns book. one of these days I will get around to publishing my Smalltalk compiler and byte code interpreter.

Ter
On Dec 23, 2012, at 9:16 PM, Rick M wrote:

> I'm trying to build an interpreter for a Java-like language using ANTLR v4. I'm skimming through the v4 Reference as well as Language Implementation Patterns, which provides an excellent example. However, since Patterns was written before ANTLR v4, it talks only about Visitors, and not Listeners. I'm wondering if a Listener can be used for the execution phase, or if a Visitor is required (because, as I understand it, the Visitor can control the tree walk better).
>
> Any advice would be much appreciated. Thanks!
>
> --
>
>

Rick Mann

unread,
Dec 24, 2012, 8:02:14 PM12/24/12
to antlr-di...@googlegroups.com

On Dec 24, 2012, at 11:04 , Terence Parr <pa...@cs.usfca.edu> wrote:

> Hi.I strongly advise that you do not use tree interpretation (visitor or listener) for building interpreters. A byte code interpreter will be much much easier to implement and understand.
>
> personally I would use a listener to generate byte codes and then use one of the byte code interpreters from the patterns book. one of these days I will get around to publishing my Smalltalk compiler and byte code interpreter.

Interesting! The Patterns book says "These [parser-based] high-level patterns are best suited to building DSLs rather than general-purpose programming languages. Usually, simplicity and low-cost implementation trump execution efficiency for DSLs." (p.232).

Ease of implementation is certainly what I was prioritizing in these early stages. As I went through the process, though, I it became evident that there was more waste than necessary, and I think, but I'm not sure, that v3's rewrite capabilities make that better than what can be done in v4.

It also occurs to me that compiling to bytecode is a form of tree rewrite, is it not?

In any case, before I take the plunge into implementing a bytecode compiler and interpreter, is it possible to clean up the AST in v4, both by eliminating nodes and by pre-computing result (like literals) and stashing them in the tree?

Thanks!

--
Rick



Sam Harwell

unread,
Dec 24, 2012, 8:14:51 PM12/24/12
to antlr-di...@googlegroups.com
If you really need to, you can use a listener to remove undesired nodes from the parse tree by removing elements from ParserRuleContext.children [see 1].

The ParseTreeProperty [see 2] class allows you to arbitrarily annotate nodes in the tree without having to alter the grammar to explicitly support the new properties.

[1]: http://antlr4.org/api/Java/org/antlr/v4/runtime/ParserRuleContext.html#children
[2]: http://antlr4.org/api/Java/org/antlr/v4/runtime/tree/ParseTreeProperty.html

--
Sam Harwell
Owner, Lead Developer
http://tunnelvisionlabs.com
--




Rick Mann

unread,
Dec 24, 2012, 8:48:22 PM12/24/12
to antlr-di...@googlegroups.com

On Dec 24, 2012, at 17:14 , Sam Harwell <s...@tunnelvisionlabs.com> wrote:

> If you really need to, you can use a listener to remove undesired nodes from the parse tree by removing elements from ParserRuleContext.children [see 1].
>
> The ParseTreeProperty [see 2] class allows you to arbitrarily annotate nodes in the tree without having to alter the grammar to explicitly support the new properties.
>
> [1]: http://antlr4.org/api/Java/org/antlr/v4/runtime/ParserRuleContext.html#children
> [2]: http://antlr4.org/api/Java/org/antlr/v4/runtime/tree/ParseTreeProperty.html

Thanks, Sam, that should do what I need. It may not be as concise as the AST rewrite rules, but should allow me to anything I need.

Terence is probably right that a bytecode interpreter is the better way to go, and after reading that section, I kinda want to try it, but I'll pursue this a bit further, especially since I already have simple method calls working (which is pretty satisfying and encouraging).

Thanks!

--
Rick



Rick Mann

unread,
Dec 25, 2012, 6:26:45 AM12/25/12
to antlr-di...@googlegroups.com

On Dec 24, 2012, at 17:14 , Sam Harwell <s...@tunnelvisionlabs.com> wrote:

> [2]: http://antlr4.org/api/Java/org/antlr/v4/runtime/tree/ParseTreeProperty.html

This actually proves to be rather cumbersome in use. I have a Listener and a Visitor, and I need to convey the information from one to the other. Sure, make one and pass it to each, but it seems more natural to just have a HashMap on ParseTree that we can just put values in, doesn't it?

--
Rick



Sam Harwell

unread,
Dec 25, 2012, 7:09:21 AM12/25/12
to antlr-di...@googlegroups.com
Personally I use something slightly different, but the idea to really work to preserve is having a type-safe way to associate documented values in the map. A Map<ParseTree, Properties> or similar is not maintainable in the long run. The shortest route may be to create a class containing a field for the ParseTree along with one or more ParseTreeProperty fields to contain the data you need for your listeners and visitors, and pass that around instead of just the ParseTree.

--
Sam Harwell
Owner, Lead Developer
http://tunnelvisionlabs.com

-----Original Message-----
From: antlr-di...@googlegroups.com [mailto:antlr-di...@googlegroups.com] On Behalf Of Rick Mann
Sent: Tuesday, December 25, 2012 5:27 AM
To: antlr-di...@googlegroups.com
Subject: Re: [antlr-discussion] Visitor vs Listener for Tree-Based Interpreter


--




Rick Mann

unread,
Dec 25, 2012, 7:20:30 AM12/25/12
to antlr-di...@googlegroups.com

On Dec 25, 2012, at 4:09 , Sam Harwell <s...@tunnelvisionlabs.com> wrote:

> Personally I use something slightly different, but the idea to really work to preserve is having a type-safe way to associate documented values in the map. A Map<ParseTree, Properties> or similar is not maintainable in the long run. The shortest route may be to create a class containing a field for the ParseTree along with one or more ParseTreeProperty fields to contain the data you need for your listeners and visitors, and pass that around instead of just the ParseTree.

That might work better, yeah. In the end I just made a ParseTreeProperty<Object> and passed it to both L&V. Seems to work. I'm able to cull a few nodes and pre-compute the literals.

--
Rick



Terence Parr

unread,
Dec 25, 2012, 12:10:57 PM12/25/12
to antlr-di...@googlegroups.com
isn't that what ParseTreeProperty is? It's a decoration to avoid a HashMap in each node just to store one or two values.

Ter

Rick Mann

unread,
Dec 25, 2012, 6:29:55 PM12/25/12
to antlr-di...@googlegroups.com

On Dec 25, 2012, at 9:10 , Terence Parr <pa...@cs.usfca.edu> wrote:

> isn't that what ParseTreeProperty is? It's a decoration to avoid a HashMap in each node just to store one or two values.

No, I mean literally a map on the node, so I don't have to pass it around externally. It's okay, though, I made it work. But I lose the type-safetiness it offers because I don't want a separate one for each type I might put on a node.

For example, in my listener's enterLiteralExpr(), I check to see what kind of literal it is, create an appropriate object (Integer, Float, String, Boolean, etc.), and stick it on the literal node. I use only a single ParseTreeProperty for "constLitVal". But I have to expose that ParseTreeProperty decorator out of both the Listener and subsequent Visitor that I use for execution.

It would be easier to just call ParseTree.setProperty(String,Object) and ParseTree.getProperty(String) when a node gets visited.

It's okay, it's working as is now, and I'm getting closer and closer to pulling the trigger on using LLVM as my bytecode interpreter (If I can just get the Java LLVM binding to build, or…a C++ parser back end ;-) ).

--
Rick



Terence Parr

unread,
Dec 26, 2012, 12:21:50 PM12/26/12
to antlr-di...@googlegroups.com

On Dec 25, 2012, at 3:29 PM, Rick Mann wrote:

>
> On Dec 25, 2012, at 9:10 , Terence Parr <pa...@cs.usfca.edu> wrote:
>
>> isn't that what ParseTreeProperty is? It's a decoration to avoid a HashMap in each node just to store one or two values.
>
> No, I mean literally a map on the node, so I don't have to pass it around externally.

that's what i meant :) "avoids a HashMap in each node just to store one or two values." note the trick:

rule[int x] : … ;

puts "int x" in rule's node.

> It's okay, though, I made it work. But I lose the type-safetiness it offers because I don't want a separate one for each type I might put on a node.
>
> For example, in my listener's enterLiteralExpr(), I check to see what kind of literal it is, create an appropriate object (Integer, Float, String, Boolean, etc.), and stick it on the literal node. I use only a single ParseTreeProperty for "constLitVal". But I have to expose that ParseTreeProperty decorator out of both the Listener and subsequent Visitor that I use for execution.
>
> It would be easier to just call ParseTree.setProperty(String,Object) and ParseTree.getProperty(String) when a node gets visited.

yeah, we didn't like lack of types and cost per node. imagine just an extra null ptr in a 1 million node tree. 8M more for people that don't want decorations.

> It's okay, it's working as is now, and I'm getting closer and closer to pulling the trigger on using LLVM as my bytecode interpreter (If I can just get the Java LLVM binding to build, or…a C++ parser back end ;-) ).

cool

Ter

Rick Mann

unread,
Dec 26, 2012, 10:25:39 PM12/26/12
to antlr-di...@googlegroups.com

On Dec 26, 2012, at 9:21 , Terence Parr <pa...@cs.usfca.edu> wrote:

>
> On Dec 25, 2012, at 3:29 PM, Rick Mann wrote:
>
>>
>> On Dec 25, 2012, at 9:10 , Terence Parr <pa...@cs.usfca.edu> wrote:
>>
>>> isn't that what ParseTreeProperty is? It's a decoration to avoid a HashMap in each node just to store one or two values.
>>
>> No, I mean literally a map on the node, so I don't have to pass it around externally.
>
> that's what i meant :) "avoids a HashMap in each node just to store one or two values." note the trick:

haha, yes, I got that the first time.

> rule[int x] : … ;
>
> puts "int x" in rule's node.

Oh, that might be useful.

> yeah, we didn't like lack of types and cost per node. imagine just an extra null ptr in a 1 million node tree. 8M more for people that don't want decorations.

Fair enough. What about something on Parser like:

setNodeProperty(ParseTree inNode, String inName, T inValue);
T getNodeProperty(ParseTree inNode, String inName, Class inValueClass)

I'm not sure of the exact syntax to genericize those.

>> It's okay, it's working as is now, and I'm getting closer and closer to pulling the trigger on using LLVM as my bytecode interpreter (If I can just get the Java LLVM binding to build, or…a C++ parser back end ;-) ).
>
> cool

LLVM is *really* cool. If I could get Xcode to link its libs into my JNI lib without bitching about STL link errors, it would be.

--
Rick



Terence Parr

unread,
Dec 28, 2012, 3:37:18 PM12/28/12
to antlr-di...@googlegroups.com
>
>> yeah, we didn't like lack of types and cost per node. imagine just an extra null ptr in a 1 million node tree. 8M more for people that don't want decorations.
>
> Fair enough. What about something on Parser like:
>
> setNodeProperty(ParseTree inNode, String inName, T inValue);
> T getNodeProperty(ParseTree inNode, String inName, Class inValueClass)
>
> I'm not sure of the exact syntax to genericize those.

Sure. you can do that but it's more or less the same thing.
>
>>> It's okay, it's working as is now, and I'm getting closer and closer to pulling the trigger on using LLVM as my bytecode interpreter (If I can just get the Java LLVM binding to build, or…a C++ parser back end ;-) ).
>>
>> cool
>
> LLVM is *really* cool. If I could get Xcode to link its libs into my JNI lib without bitching about STL link errors, it would be.

LLVM is awesome.

Ter

Jim Idle

unread,
Dec 28, 2012, 3:40:39 PM12/28/12
to antlr-di...@googlegroups.com
If LLVM gets too awkward, you can always use ASM and generate JVM byte
code. That is almost trivial. Means you are relying on the Java VM of
course. LLVM is great.

Jim


> -----Original Message-----
> From: antlr-di...@googlegroups.com [mailto:antlr-
> discu...@googlegroups.com] On Behalf Of Terence Parr
> Sent: Friday, December 28, 2012 12:37 PM
> To: antlr-di...@googlegroups.com
> Subject: Re: [antlr-discussion] Visitor vs Listener for Tree-Based
> Interpreter
>
> >
> >> yeah, we didn't like lack of types and cost per node. imagine just
> an extra null ptr in a 1 million node tree. 8M more for people that
> don't want decorations.
> >
> > Fair enough. What about something on Parser like:
> >
> > setNodeProperty(ParseTree inNode, String inName, T inValue);
> > T getNodeProperty(ParseTree inNode, String inName, Class
> inValueClass)
> >
> > I'm not sure of the exact syntax to genericize those.
>
> Sure. you can do that but it's more or less the same thing.
> >
> >>> It's okay, it's working as is now, and I'm getting closer and
> closer to pulling the trigger on using LLVM as my bytecode interpreter
> (If I can just get the Java LLVM binding to build, or.a C++ parser back
> end ;-) ).
> >>
> >> cool
> >
> > LLVM is *really* cool. If I could get Xcode to link its libs into my
> JNI lib without bitching about STL link errors, it would be.
>
> LLVM is awesome.
>
> Ter
>
> --
>

Rick Mann

unread,
Dec 28, 2012, 6:54:45 PM12/28/12
to antlr-di...@googlegroups.com

On Dec 28, 2012, at 12:37 , Terence Parr <pa...@cs.usfca.edu> wrote:

> Sure. you can do that but it's more or less the same thing.

But it make my code tidier. I don't have to manage a bunch of ParseTreeProperty objects. :-)

> LLVM is awesome.

It really is. I played with it a while back, always wanted to use it.

--
Rick



Rick Mann

unread,
Dec 28, 2012, 6:58:54 PM12/28/12
to antlr-di...@googlegroups.com

On Dec 28, 2012, at 12:40 , Jim Idle <ji...@temporal-wave.com> wrote:

> If LLVM gets too awkward, you can always use ASM and generate JVM byte
> code. That is almost trivial. Means you are relying on the Java VM of
> course. LLVM is great.

It turns out the JVM runs on my target hardware (a small embedded Linux device), but it would be good build something that doesn't require Java. Plus, on OS X at least, I can't find the proper libjvm.dylib to resolve the JNI_CreateJavaVM() call.

On the other hand, I can't decide which is easier: building the JNI bridge to LLVM, or building the v4 C++ back-end. I'd say the former, but I've got a ridiculous STL link error that shouldn't be, and no one seems to be able to help me resolve it.

So, I'm stuck.

Are you referring to this? http://asm.ow2.org

--
Rick



Jim Idle

unread,
Dec 28, 2012, 7:05:08 PM12/28/12
to antlr-di...@googlegroups.com
Yes - that is the framework - very good also.

I have been travelling so I may have missed your issue, but have you tried
just something simple like this first:

http://mrjoelkemp.com/2012/01/getting-started-with-jni-and-c-on-osx-lion/


Rather than start the C binary, you could start a small Java stub that
called your ObjC stuff.

But are you linking with -framework javaVm


Jim


> -----Original Message-----
> From: antlr-di...@googlegroups.com [mailto:antlr-
> discu...@googlegroups.com] On Behalf Of Rick Mann
> Sent: Friday, December 28, 2012 3:59 PM
> To: antlr-di...@googlegroups.com
> Subject: Re: [antlr-discussion] Visitor vs Listener for Tree-Based
> Interpreter
>
>
> --
>

Rick Mann

unread,
Dec 28, 2012, 7:32:33 PM12/28/12
to antlr-di...@googlegroups.com

On Dec 28, 2012, at 16:05 , Jim Idle <ji...@temporal-wave.com> wrote:

> I have been travelling so I may have missed your issue, but have you tried
> just something simple like this first:
>
> http://mrjoelkemp.com/2012/01/getting-started-with-jni-and-c-on-osx-lion/
>
>
> Rather than start the C binary, you could start a small Java stub that
> called your ObjC stuff.

No, I can make a JNI call without problem (by the way, nowadays, JNI libs are .dylib, not .jnilib). I have two other problems. One has nothing to do with LLVM, and has to do with embedding the JVM (allowing my process to have a JVM run inside it). I can't find a library to link for the create call.

> But are you linking with -framework javaVm

Beginning with JDK 7 on OS X 10.8, Oracle provides the JVM, and AFAIK, the JavaVM.framework is not to be used. But that's a problem for later, and only if I need to stick with Java.

The problem I'm having now is that when I like my JNI dylib against libLLVMCore.a, Xcode complains about being unable to link a TON of STL calls. Now, this doesn't really make sense to me, as STL methods typically get instantiated at compile time.

No one in the LLVM community or the Xcode mailing list has been able to give me the right incantation of clang options to get LLVM to link inside my Xcode project. Note that if I call clang++ directly on a trivial test app, I can link LLVM without issue. I'm boggled.

If you're excessively curious, I've posted a couple of comparisons of the two build approaches online:

http://pastebin.com/n9nCat15
http://pastebin.com/kufBknaw

--
Rick



Reply all
Reply to author
Forward
0 new messages