LL(k) analysis for ANTLR 4 grammars?

Nikolay Ognyanov

nepročitano,

24. tra 2018. 11:36:5124. 04. 2018.

u antlr-discussion

The power of ANTLR 4 is impressive but also - scary. In the process of grammar development it is easy to overlook inefficiencies and outright bugs. Therefore I have a question : is anybody aware of tool(s) for (linearized) LL(k) analysis of ANTLR 4 grammars? I use occasionally for this purpose AntlrWorks 1.x with some success. That however implies conversion of the grammar to ANTLR 3 which can be tedious and error prone.

In case that ANTLR 4 authors are listening here is a question : Would you consider adding (linearized) LL(k) analysis for the purpose of grammar development support ?

Mike Lischke

nepročitano,

25. tra 2018. 05:12:5025. 04. 2018.

u antlr-di...@googlegroups.com

Hi Nikolay,

> The power of ANTLR 4 is impressive but also - scary. In the process of grammar development it is easy to overlook inefficiencies and outright bugs. Therefore I have a question : is anybody aware of tool(s) for (linearized) LL(k) analysis of ANTLR 4 grammars? I use occasionally for this purpose AntlrWorks 1.x with some success. That however implies conversion of the grammar to ANTLR 3 which can be tedious and error prone.
>
> In case that ANTLR 4 authors are listening here is a question : Would you consider adding (linearized) LL(k) analysis for the purpose of grammar development support ?

Well, I haven’t studied compiler design (but instead applied informatics), so I’m rather a compiler hobbyist. But if you can explain what you would need I might be interested in adding such features to my ANTLR4 extension for Visual Studio code. Of course I cannot add anything to ANTLR4 itself and I doubt, even with a complete patch provided by the community, that anything like that is being added anytime soon. Consequently, only what’s currently there can be used for such analysis.

Mike
--
www.soft-gems.net

Nikolay Ognyanov

nepročitano,

25. tra 2018. 22:50:1925. 04. 2018.

u antlr-discussion

What I mean by "LL(k) analysis" is primarily computation of FirstK and FollowK sets and discovery of conflicts based on those. The basic theory is available in essentially every compiler design textbook. See e.g. https://www.amazon.com/Compiler-Design-Syntactic-Semantic-Analysis/dp/3642175392/ref=sr_1_fkmr0_3?ie=UTF8&qid=1524709854&sr=8-3-fkmr0&keywords=Reinhard+Wilhelm++compiler+design+syntactic+ans+semantic+analysis . And what I mean by "linearized LL(k)" is a simplified version of FirstK/FollowK which deals with strings of terminal sets rather than set of terminal strings. A set at position i in such string contains all terminals which can occur at position i in some string in the correspondent FirstK/FollowK set.

Regards
Nikolay

Mike Lischke

nepročitano,

26. tra 2018. 03:37:1126. 04. 2018.

u antlr-di...@googlegroups.com

What I mean by "LL(k) analysis" is primarily computation of FirstK and FollowK sets and discovery of conflicts based on those.

However, ANTLR4 uses a dynamic lookahead strategy (varying k), which makes it difficult (if not impossible) to compute individual follow k sets. Sam and Ter should be able to give a more detailed comment on the analysis part here.

And what I mean by "linearized LL(k)" is a simplified version of FirstK/FollowK which deals with strings of terminal sets rather than set of terminal strings. A set at position i in such string contains all terminals which can occur at position i in some string in the correspondent FirstK/FollowK set.

There’s a class LL1Analyzer (https://github.com/antlr/antlr4/blob/master/runtime/Java/src/org/antlr/v4/runtime/atn/LL1Analyzer.java) which computes the lookahead set for each transition of a given ATN state. Is that something that can help you?

Mike
--
www.soft-gems.net

Nikolay Ognyanov

nepročitano,

26. tra 2018. 08:13:5926. 04. 2018.

u antlr-discussion

Thank you for the reply. I am aware that LL(k) analysis is not relevant to ANTLR 4 parsing strategy. It would be good to have it just as analysis which is not related to code generation - to be used as a helper in the process of grammar development.

The class LL1Analyzer does not appear to be directly usable for the purposes I describe. Furthermore it only works for k=1.

Regards
Nikolay

Nikolay Ognyanov

nepročitano,

28. tra 2018. 17:11:3928. 04. 2018.

u antlr-discussion

Thank you all. Apparently no tool of the type I dream of is available. So I will give it a try myself here :
https://github.com/santlr/santlr . Will keep you posted.

Mike Lischke

nepročitano,

29. tra 2018. 05:45:4729. 04. 2018.

u azrdev via antlr-discussion

>
> Thank you all. Apparently no tool of the type I dream of is available. So I will give it a try myself here :
> https://github.com/santlr/santlr . Will keep you posted.

Ping me when you have something that is worth to be included in an IDE. I have no problem with adding yet another code generator (I already have 2 ANTLR4 jars in the extension). We can then work out how to visualize the results on a GUI page, if you like.

Mike
--
www.soft-gems.net

Nikolay Ognyanov

nepročitano,

29. tra 2018. 10:36:3329. 04. 2018.

u antlr-discussion

Hi Mike,

Thank you for the kind offer. I will notify you as soon as I have something usable. Probably not too soon though. I am just starting...

Regards
Nikolay

Nikolay Ognyanov

nepročitano,

1. svi 2018. 12:57:0101. 05. 2018.

u antlr-discussion

Hi Mike,

I believe that the public API of my SANTLR tool (less the code generation) is pretty much rounded and documented. The only area which is missing (and will be missing for some time) is drill down into conflict details.
Not being a GUI/IDE person myself, I would appreciate very much feedback from as to whether and how it makes sense from the standpoint of an IDE developer. The project is at https://github.com/santlr/santlr and the API javadoc is at https://santlr.github.io/ .

Regards
Nikolay

On Sunday, April 29, 2018 at 12:45:47 PM UTC+3, Mike Lischke wrote:

Mike Lischke

nepročitano,

2. svi 2018. 10:20:2602. 05. 2018.

u azrdev via antlr-discussion

Hey Nikolay,

>
> I believe that the public API of my SANTLR tool (less the code generation) is pretty much rounded and documented. The only area which is missing (and will be missing for some time) is drill down into conflict details.

That looks like it could become a very useful tool. Important here is not only to point out what is wrong, but provide some help how to fix it. That might be non-trivial, e.g. with indirect left recursion, while for direct left recursion it’s almost irrelevant, since internally such a rule is converted to a non-left-recursive ATN structure by ANTLR4.

Another important information is the maximum lookahead used for a given input, which directly affects performance and memory footprint. Help to lower the lookahead would certainly be welcome! Could be there’s already some profiling info available related to that. However, in the Typescript runtime from Sam (which I use in my extension) the profiler classes are not complete yet.

> Not being a GUI/IDE person myself, I would appreciate very much feedback from as to whether and how it makes sense from the standpoint of an IDE developer. The project is at https://github.com/santlr/santlr and the API javadoc is at https://santlr.github.io/ .

I wonder why you want to work with an AST. It’s inferior to the parse tree and contains only informations about the input (while the parse tree contains context/invocation information, as well as the input details). All an AST does is to put the input symbols into an arbitrary tree structure. The parse tree provides this "by nature“.

Also, I believe the syntax highlighting feature shouldn’t be in SANTLR. It’s a pure text editor/IDE feature and only puts unrelated burden on the new tool.

Btw, when I we really come to include SANTLR in vscode, we will have to convert it to Typescript - jfyi.

Mike
--
www.soft-gems.net

Nikolay Ognyanov

nepročitano,

2. svi 2018. 11:07:3102. 05. 2018.

u antlr-discussion

Hi Mike,

Thank you for the feedback.

I work with AST because the parse tree has way too many irrelevant details in it and only gets in the way when I do analysis.

You are the IDE expert, so I trust you on the highlighting and will consider leaving it out at least in Phase 1.

I was not aware that you work in Typescript. If there is no way to interface to Java code from it then I am afraid I can be of no help. At this time I only do Java.

Regards
Nikolay

Mike Lischke

nepročitano,

6. svi 2018. 04:30:4606. 05. 2018.

u azrdev via antlr-discussion

>
> I was not aware that you work in Typescript. If there is no way to interface to Java code from it then I am afraid I can be of no help. At this time I only do Java.

No worries. Right after sending the mail I realized my note about Typescript was irrelevant. I had already planned to call the SANTLR jar like the others.

Mike
--
www.soft-gems.net

Nikolay Ognyanov

nepročitano,

7. svi 2018. 06:05:1607. 05. 2018.

u antlr-discussion

BTW, I am primarily an Eclipse user but I installed and used for the first time in my life
Visual Studio Code in order to have a look at your extension. It is great. Congratulations
and thank you for developing it.

Regards
Nikolay

Odgovori svima

Odgovori autoru

Proslijedi