Antlr4 debugging

beeble...@gmail.com

Apr 23, 2013, 1:10:25 AM
to antlr-di...@googlegroups.com
Could I get advice on the best ways to debug ANTLR 4 parser grammars effectively? The Definitive Reference and the wiki don't have much information on the subject.

I was looking at screen shots of antlrworks and it showed a lot of neat debugging tools that would graph ambiguities and debug through the parsing steps graphically. This is exactly what I'd like to be able to do in antlrworks2. Are there similar hidden gems in antlrworks2?

The decision tree graph that highlights ambiguities would be amazing, but how do I get the generated ATN graph Mr. Harwell shows on the issue tracker (https://f.cloud.github.com/assets/1408396/398668/e19e283c-a85c-11e2-8dac-0d29979906d9.png)?

I have been able to remote debug into the test rig, but stepping through it has not been very helpful for tracking down ambiguities. The output about ambiguities isn't helpful either. Even though it shows d=<N> and then the alternatives, I haven't been able to figure out how to map <N> to rules in a larger parser, and it appears that the ambiguity numbers don't always match up to the same alternatives depicted in the grammar. My guess is that they match the unrolled recursive grammar, which makes the diagnostic output unhelpful.

One thing I would like to see the test rig do is something similar to --trace. I'd like to see the trace entering and exiting EVERY parser rule so I can track decision branch flow to better see when the grammar isn't working how I expect it to. I think that would have helped out a bunch when I spent the last couple of days tracking down an issue with the grammar that turned out to be an antlr decision bug. The only way I found out it was a bug was by updating antlr4 to the latest git commit and now it all magically works.

Another thing I would like to see is the ability to use the lexer debugger in antlrworks2 against the output from the test rig's tokens option. The language I'm working on unfortunately requires several predicates in the lexer, so the tokens shown by the lexer debugger do not match the tokens generated by the actual lexer in production.

Anyway, I can't wait to see the next revision of antlrworks2 and more information on how to effectively debug antlr4 grammars!

Thanks,
James

Terence Parr

Apr 23, 2013, 2:57:21 PM
to antlr-di...@googlegroups.com
Hi James, excellent questions. First, let me draw your attention to the -atn option on the other command-line tool (the ANTLR tool itself rather than the test rig), which will generate the graph you show. Also, you can get a trace by asking the parser to print rule entry, rule exit, and token match events as it parses:

/** During a parse it is sometimes useful to listen in on the rule entry and exit
 *  events as well as token matches. This is for quick and dirty debugging.
 */
public void setTrace(boolean trace) {...}

I think this will get you where you need to go.
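
On a large grammar the trace can be long and flat, so it may help to indent it by rule depth before reading it. The following is only a rough sketch under stated assumptions: it assumes that rule entry lines begin with "enter" and rule exit lines begin with "exit", and the class name TraceIndenter and the sample lines are made up for illustration. Check the actual output of your ANTLR version before relying on it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of ANTLR): indents trace lines by rule depth. */
public class TraceIndenter {

    /**
     * Assumption: rule entry lines start with "enter" and rule exit lines
     * start with "exit". Every other line is kept at the current depth.
     */
    public static List<String> indent(List<String> traceLines) {
        List<String> out = new ArrayList<>();
        int depth = 0;
        for (String line : traceLines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("exit")) {
                // Step back out first so the exit lines up with its enter.
                depth = Math.max(0, depth - 1);
            }
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < depth; i++) {
                sb.append("  ");
            }
            out.add(sb.append(trimmed).toString());
            if (trimmed.startsWith("enter")) {
                // Lines that follow belong to the rule just entered.
                depth++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample lines in the assumed format.
        List<String> sample = Arrays.asList(
            "enter   expr, LT(1)=1",
            "enter   atom, LT(1)=1",
            "consume 1 rule atom",
            "exit    atom, LT(1)=+",
            "exit    expr, LT(1)=+");
        for (String line : indent(sample)) {
            System.out.println(line);
        }
    }
}
```

With the sample above, each enter/exit pair ends up at the same indentation and everything between them sits one level deeper, which makes it easier to see which rule a token was consumed in.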
Ter




Sam Harwell

Apr 23, 2013, 3:15:52 PM
to antlr-di...@googlegroups.com

Note that setTrace uses addParseListener, and is therefore subject to the limitations described in that method. Here is a copy of the latest documentation describing the limitations of a parse listener.

 

/**
 * Registers {@code listener} to receive events during the parsing process.
 * <p/>
 * To support output-preserving grammar transformations (including but not
 * limited to left-recursion removal, automated left-factoring, and
 * optimized code generation), calls to listener methods during the parse
 * may differ substantially from calls made by
 * {@link ParseTreeWalker#DEFAULT} used after the parse is complete. In
 * particular, rule entry and exit events may occur in a different order
 * during the parse than after the parser. In addition, calls to certain
 * rule entry methods may be omitted.
 * <p/>
 * With the following specific exceptions, calls to listener events are
 * <em>deterministic</em>, i.e. for identical input the calls to listener
 * methods will be the same.
 *
 * <ul>
 * <li>Alterations to the grammar used to generate code may change the
 * behavior of the listener calls.</li>
 * <li>Alterations to the command line options passed to ANTLR 4 when
 * generating the parser may change the behavior of the listener calls.</li>
 * <li>Changing the version of the ANTLR Tool used to generate the parser
 * may change the behavior of the listener calls.</li>
 * </ul>
 *
 * @param listener the listener to add
 *
 * @throws NullPointerException if {@code} listener is {@code null}
 */
public void addParseListener(@NotNull ParseTreeListener listener) {

 

Thank you,
--
Sam Harwell
Owner, Lead Developer
http://tunnelvisionlabs.com

beeble...@gmail.com

Apr 23, 2013, 9:16:20 PM
to antlr-di...@googlegroups.com
Thanks for the response Ter,

The setTrace() option does sound useful. I'll have to research a bit more how to use it easily in these early stages of development. Currently I have simply been using the test rig's -trace option, but it seems I need to write my own driver program to enable that level of tracing?

I thought the -atn option was broken in the generator because Graphviz 2.3 could not read the files. It turns out that the generated .dot files have lines like:

s730:p0 -> s728 [fontname="Times-Italic", label="&epsilon;"];

Notice the :p0 appended to the first field. Graphviz and other visual tools that support .dot files don't like the :p0. If I manually remove them, then I get a graph, but I don't know what the :p0 is intended to do, so I am unsure whether the result is an accurate representation of the ATN.

Thanks again,
James

beeble...@gmail.com

Apr 23, 2013, 9:24:35 PM
to antlr-di...@googlegroups.com
Thanks Sam,

Those 'limitations' sound like exactly what I need to see in my trace to determine what is really going on. If I use the -Xlog option to see the transformed grammar I can use that to more easily follow the trace statements given, correct? A nice feature would be a program that would just output a transformed .g4 file as part of the generation process.

~James

Sam Harwell

Apr 23, 2013, 11:10:46 PM
to antlr-di...@googlegroups.com
The :p0 specifies a port. You can see similar syntax many places in this example:
http://www.graphviz.org/content/datastruct

You might try updating your copy of GraphViz to 2.28 or newer. You can also use the following extension to more easily visualize the results:
http://blog.280z28.org/archives/2013/01/109/
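
For tools that reject the port syntax, the manual workaround mentioned earlier in the thread (removing the :p0 suffix) could be scripted. This is a minimal sketch, not something ANTLR or Graphviz provides: the class name DotPortStripper is made up, and the pattern assumes node ids of the form s730 and ports of the form :p0, as in the line quoted above. A port only says where on a node an edge attaches, so stripping it should leave the same nodes and edges, but the drawing may differ and anything the attachment point was meant to convey is lost.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: removes ":pN" port suffixes from node ids in a .dot line. */
public class DotPortStripper {

    // Assumption: node ids look like "s730" and ports look like ":p0".
    private static final Pattern NODE_WITH_PORT =
        Pattern.compile("\\b(s\\d+):p\\d+\\b");

    /** Returns the line with every "sN:pM" reference reduced to "sN". */
    public static String stripPorts(String dotLine) {
        return NODE_WITH_PORT.matcher(dotLine).replaceAll("$1");
    }

    public static void main(String[] args) {
        String line =
            "s730:p0 -> s728 [fontname=\"Times-Italic\", label=\"&epsilon;\"];";
        // Prints the same edge with the port removed from the source node.
        System.out.println(stripPorts(line));
    }
}
```

The pattern is deliberately naive: it does not parse the DOT language, so text inside a quoted label that happens to look like sN:pM would be rewritten too.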

Thank you,
--
Sam Harwell
Owner, Lead Developer
http://tunnelvisionlabs.com

James Hart

Apr 24, 2013, 12:03:41 AM
to antlr-di...@googlegroups.com
Thanks again. I apologize; I have been able to confirm that it isn't an issue with the .dot files.

After your response I played around some more in Graphviz (version 2.30.1) and found that the :p0 just makes what appears to be a bug in Graphviz occur more frequently. If I repeatedly click the 'layout' button, it sometimes produces the graph and other times claims there is a syntax error in the .dot file. There is another issue with a .dot editor I use in Eclipse, which flags the :p0 as a syntax error. Fun. Thanks for the link to your extension; I'll have to try it out.

Jim Idle

Apr 24, 2013, 12:18:29 AM
to antlr-di...@googlegroups.com
For Eclipse, try the TextUML plugin from the Eclipse Marketplace.

Jim