How to implement Auto completion using ANTLR4

pranshu agarwal

Jul 27, 2017, 2:06:44 AM
to antlr-discussion

Hi,

I want to implement code completion for Verilog and VHDL, and I am using an ANTLR4 parser for my project.

I have seen that GoWorks implements code completion using ANTLR4, but I am curious to know:

1. How does it handle cases where there are errors before the caret position?

2. Does it require writing extra grammar rules to handle the error cases?

Mike Lischke

Jul 27, 2017, 2:39:18 AM
to antlr-di...@googlegroups.com
> I have seen that GoWorks implements code completion using ANTLR4, but I am curious to know:
>
> 1. How does it handle cases where there are errors before the caret position?

It cannot. If there's an error before the caret position you cannot predict what the user intended. You can try to determine if the error is just a simple one (omission, typo, etc.), but that's just guesswork and may often lead to wrong suggestions.

> 2. Does it require writing extra grammar rules to handle the error cases?

What rules do you have in mind? How can you predict which errors will occur? And again, with syntax errors you cannot even find the place in the grammar/ATN from where you would start collecting candidates.

For a concrete code completion implementation for ANTLR4-based parsers, see https://github.com/mike-lischke/antlr4-c3 (or Federico Tomassetti's Kotlin translation of this code, https://github.com/ftomassetti/antlr4-c3-kotlin).
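
To make the point about the missing prediction point concrete, here is a toy sketch in Python. It is not antlr4-c3 and not ANTLR4's ATN: the `TRANSITIONS` table, its state names, and `collect_candidates` are hypothetical, hand-written stand-ins for a tiny Verilog-like module header. The sketch walks the tokens before the caret through the table and returns the tokens allowed next, or `None` as soon as a token does not fit, because there is then no state left to collect from.

```python
# Toy grammar as a state machine: state -> {token: next_state}.
# Hypothetical and hand-written; a real ANTLR4 ATN is far richer.
TRANSITIONS = {
    "start": {"module": "name"},
    "name": {"ID": "ports"},
    "ports": {"(": "port", ";": "body"},
    "port": {"ID": "port_sep"},
    "port_sep": {",": "port", ")": "end_ports"},
    "end_ports": {";": "body"},
    "body": {"wire": "name_decl", "reg": "name_decl", "endmodule": "start"},
    "name_decl": {"ID": "decl_end"},
    "decl_end": {";": "body"},
}


def collect_candidates(tokens_before_caret):
    """Return the tokens valid at the caret, or None if the prefix has a syntax error."""
    state = "start"
    for token in tokens_before_caret:
        next_state = TRANSITIONS[state].get(token)
        if next_state is None:
            # The prefix is invalid, so there is no state to collect candidates from.
            return None
        state = next_state
    return sorted(TRANSITIONS[state])
```

With a valid prefix such as `module ID ( ID` the sketch yields `)` and `,`; with a stray extra `ID` after it, the walk fails and the result is `None`.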


pranshu agarwal

Jul 27, 2017, 5:50:13 AM
to antlr-discussion
Thanks Mike.

It's strange that if there are any errors before the caret position then we can't find its caret context.

Do you mean that a new parser needs to be implemented that cares only about scope and does not check for strict syntax?

Most IDEs currently have a code completion feature like IntelliSense, and I am curious how they have implemented it.

Please suggest other ideas for implementing code completion that works well when there are errors before the caret position.

Mike Lischke

Jul 27, 2017, 6:08:40 AM
to antlr-di...@googlegroups.com

> It's strange that if there are any errors before the caret position then we can't find its caret context.

How can a navigator give you the path to a target when one of the roads leading to it no longer exists (or is blocked)? With a syntax error you cannot find the prediction point from which to collect candidates, regardless of which parser is used.

>
> Do you mean that a new parser needs to be implemented that cares only about scope and does not check for strict syntax?

No, I'm not suggesting anything like this.

>
> Most IDEs currently have a code completion feature like IntelliSense, and I am curious how they have implemented it.

You can of course look at their source code to learn how they do it, but they can't perform magic either. If you don't have enough context information you cannot finish the process.

You can work around the problem (e.g. by suggesting frequently used words, or by suggesting from a previous parse run where everything was still syntactically correct), but this will always be suboptimal and won't change the fact that with errors before the invocation point you cannot predict candidates.
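
The "frequently used words" workaround could look roughly like the Python sketch below. It involves no parser at all: it only scans the buffer text for identifiers that start with what the user has typed and ranks them by how often they occur. The function name `fallback_suggestions`, the identifier regex, and the `limit` parameter are assumptions made for this illustration, not part of ANTLR4 or any IDE.

```python
import re
from collections import Counter


def fallback_suggestions(buffer_text, prefix, limit=5):
    """Suggest identifiers from the buffer that start with prefix, most frequent first."""
    words = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", buffer_text)
    # Count matching identifiers, skipping the prefix itself.
    counts = Counter(w for w in words if w.startswith(prefix) and w != prefix)
    # Sort by descending frequency, then alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:limit]]
```

Because it ignores syntax entirely, this keeps working with errors before the caret, but it will also offer words that are not valid at that position, which is the suboptimality described above.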

Mike
--
www.soft-gems.net

Sam Harwell

Jul 27, 2017, 6:54:46 AM
to antlr-di...@googlegroups.com

Mike’s statement here isn’t precise. GoWorks includes several mechanisms intended to avoid the loss of code completion performance or accuracy in the case of errors (including those which appear before the caret). A more accurate statement would be:

There are cases where code completion will not perform in the event of an error before the caret. The “severity” of the error (from the perspective of the parser) as well as the proximity to the caret correlate with the impact on code completion.

The overall intent is code completion should not be impacted by code which is not visible on screen (out of sight, out of mind). Specific mechanisms involved in this include:

  1. Symbol and navigation information from previous versions of a file are not discarded from caches until a file parses correctly, so code completion can continue to reference these symbols while a user is typing.
  2. Code completion doesn’t attempt to parse the entire file, but instead starts as close to the caret as possible.
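
The first mechanism can be sketched in Python as a cache that is only overwritten by an error-free parse. This is a minimal illustration, not GoWorks code: the `SymbolCache` class, its method names, and the `had_errors` flag are hypothetical, and a real implementation would store much more than plain symbol names.

```python
class SymbolCache:
    """Keeps the symbols from the last error-free parse of each file."""

    def __init__(self):
        self._symbols = {}  # file path -> set of symbol names

    def update(self, path, symbols, had_errors):
        """Replace the cached symbols only when the parse had no errors."""
        if not had_errors:
            self._symbols[path] = set(symbols)

    def lookup(self, path, prefix):
        """Return cached symbols that start with prefix, sorted."""
        return sorted(s for s in self._symbols.get(path, ()) if s.startswith(prefix))
```

While the user is typing and the file does not parse, `update` leaves the previous entry alone, so `lookup` keeps returning the symbols from the last good version.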

 

The restart points for the code completion, hover tips, smart indent, and other related features are called “anchor points”, and come in two forms. The first is reference anchor points. These are calculated and cached following a successful parse of the entire file. The second is called dynamic anchor points, which I intended to calculate as a modification to the reference anchor points using incomplete information following parse operations with errors. However, in practice I found that the reference anchor points alone performed so well that I never added the secondary logic.
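
As a rough Python sketch of how reference anchor points might be used: the post does not say what an anchor point contains, so this example simply assumes each one is a character offset, recorded after a successful parse, at which parsing can be restarted. The helpers `nearest_anchor` and `text_to_parse` are hypothetical names for this illustration.

```python
import bisect


def nearest_anchor(anchor_offsets, caret_offset):
    """Return the closest anchor offset at or before the caret, or 0 if none exists.

    anchor_offsets must be sorted in ascending order.
    """
    index = bisect.bisect_right(anchor_offsets, caret_offset)
    return anchor_offsets[index - 1] if index > 0 else 0


def text_to_parse(source, anchor_offsets, caret_offset):
    """Slice the source from the nearest anchor up to the caret."""
    start = nearest_anchor(anchor_offsets, caret_offset)
    return source[start:caret_offset]
```

Only the slice returned by `text_to_parse` would be handed to the parser, so an error that sits before the chosen anchor never reaches it.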

 

Sam

Mike Lischke

Jul 27, 2017, 7:48:48 AM
to antlr-di...@googlegroups.com

> Mike’s statement here isn’t precise. GoWorks includes several mechanisms intended to avoid the loss of code completion performance or accuracy in the case of errors (including those which appear before the caret). A more accurate statement would be:
>
> There are cases where code completion will not perform in the event of an error before the caret. The “severity” of the error (from the perspective of the parser) as well as the proximity to the caret correlate with the impact on code completion.
>
> The overall intent is code completion should not be impacted by code which is not visible on screen (out of sight, out of mind). Specific mechanisms involved in this include:
>
>   1. Symbol and navigation information from previous versions of a file are not discarded from caches until a file parses correctly, so code completion can continue to reference these symbols while a user is typing.
>   2. Code completion doesn’t attempt to parse the entire file, but instead starts as close to the caret as possible.

These are some important points, Sam. However, they ultimately just confirm what I said: once you have an error in the code (before the caret), you cannot directly provide code completion from that. You use cached content instead and/or try to "fix" the syntax error in simple cases.

 
> The restart points for the code completion, hover tips, smart indent, and other related features are called “anchor points”, and come in two forms. The first is reference anchor points. These are calculated and cached following a successful parse of the entire file. The second is called dynamic anchor points, which I intended to calculate as a modification to the reference anchor points using incomplete information following parse operations with errors. However, in practice I found that the reference anchor points alone performed so well that I never added the secondary logic.

Interesting idea. Fortunately with ANTLR4 it's easy to parse subparts of the entire language. What do you use as anchor points? ATN state + input position? And what are the criteria for creating an anchor point (they depend on the language being parsed)?


pranshu agarwal

Jul 27, 2017, 8:30:21 AM
to antlr-discussion
Thanks Mike and Sam for your suggestions.

Sam/Mike, 

I understand that I can cache data from the last successful parse, but to make the next suggestion I still need to know the context at the caret position.

For example (C++), suppose the caret is inside the function body, but there is an error in the function's parameter declaration list:

void fun(int a, int b double )
{
   //caret is inside function 
}

Can ANTLR4 recover from this type of error automatically, or will the parser stop at the error? If it stops at the error, then it is not possible to get the current context in this case.