Getting Probable keyword list for auto code completion


Diyoda Sajjana

Sep 18, 2013, 10:03:17 AM
to antlr-discussion
I am implementing code completion for my project, which uses an ANTLR parser. Is there a definite way to get the list of keywords that are valid at a given position through the ANTLR 4 API? I am happy to provide a more detailed specification of my implementation. I have also posted a question on SO about this.

Mike Lischke

Sep 18, 2013, 11:14:47 AM
to antlr-di...@googlegroups.com

ANTLR can help with this task, but it plays only a small role in an autocompletion implementation. What you want to show your user is not that an identifier or a string may come next at a given position, but rather that a class member or reference is allowed there. The parser does not have this kind of information, as it only knows lexical elements.

You can of course use a parser to parse, e.g., a class definition (which is a task of its own if your compilation units can be large and you want quick response times). But finding the rest of the necessary information is left to you. For a while I thought that getting the follow set for a token would let me collect most of the needed information automatically but, as I wrote above, this gives you at most a list of allowed tokens (mostly your reserved words and a number of base types like identifiers, numbers and strings, depending on your grammar).

The allowed keywords would at least be a good help, but as soon as your grammar also allows them to be used as identifiers they become useless as well (with respect to code completion).
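
A minimal sketch of this limitation, in Python and without the ANTLR API. The toy grammar, the token names and the `expected_tokens` helper are all hypothetical and exist only for illustration; the table merely mimics what a follow set for a position looks like.

```python
# Toy grammar written as a transition table: state -> {token type: next state}.
# It roughly encodes: stmt : 'var' ID '=' expr ';' ;  expr : ID | NUMBER | STRING | 'new' ID ;
# (Hypothetical grammar, not taken from any real project.)
TRANSITIONS = {
    "start": {"VAR": "after_var"},
    "after_var": {"ID": "after_name"},
    "after_name": {"ASSIGN": "expr"},
    "expr": {"ID": "after_expr", "NUMBER": "after_expr",
             "STRING": "after_expr", "NEW": "after_new"},
    "after_new": {"ID": "after_expr"},
    "after_expr": {"SEMI": "start"},
}


def expected_tokens(token_types):
    """Return the sorted token types that may follow the given token sequence."""
    state = "start"
    for token_type in token_types:
        # A KeyError here means the prefix itself is already invalid.
        state = TRANSITIONS[state][token_type]
    return sorted(TRANSITIONS[state])


# Caret position right after `var x =`
print(expected_tokens(["VAR", "ID", "ASSIGN"]))  # ['ID', 'NEW', 'NUMBER', 'STRING']
```

The result names `ID` as a possibility but says nothing about which variables or class names are in scope, which is the part the user actually wants to see.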

The final stumbling block, however, is that most of the time you are parsing invalid input. More often than not you get only an error for your input. So, forget about using a parser as the main component of your auto completion implementation.


Sam Harwell

Sep 19, 2013, 10:28:09 PM
to antlr-di...@googlegroups.com

Hi Mike,

Your final statement regarding avoiding the use of a parser couldn’t be more wrong. GoWorks exclusively uses the ANTLR 4 parsing infrastructure to gather information necessary for code completion. ANTLRWorks 2 uses this as well, although the result is not quite as interesting because the grammar language is so simple.

That said, it does require heavy modifications of the default behavior to gather this information (at least in my implementation). Among other things, the code uses:

1. The -Xforce-atn option, which forces the parser to use adaptivePredict even for decisions which appear to be statically LL(1). This allows me to “pretend” tokens are tokens of different types and see how it affects parsing. For example, if “object” is a keyword, with this option I can use a custom prediction algorithm to force “object” to be treated as an identifier so I can see what the results would look like if the user were 1 character away from typing “object1”.

2. A class derived from ParserATNSimulator with custom analysis and control hooks.

3. Many listener and visitor implementations.

4. A custom error strategy implementation.
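
The "pretend the token has a different type" idea from item 1 can be illustrated with a small standalone sketch in Python. This is not the GoWorks code and it does not touch ParserATNSimulator or any other ANTLR class; the toy lexer, the one-rule recognizer and every name in it are assumptions made up for the example.

```python
KEYWORDS = {"object", "var"}  # hypothetical keyword set


def lex(text):
    """Split on whitespace and tag each word as a keyword type or as ID."""
    return [(word.upper() if word in KEYWORDS else "ID", word)
            for word in text.split()]


def parses(types):
    """Toy recognizer for the rule: decl : 'var' ID | 'object' ID ;"""
    return types in (["VAR", "ID"], ["OBJECT", "ID"])


def interpretations(text):
    """Try the input as lexed, then again with a trailing keyword retyped as ID.

    Returns the token-type sequences that the toy recognizer accepts.
    """
    tokens = lex(text)
    candidates = [tokens]
    if tokens and tokens[-1][0] != "ID":
        # The caret sits right after a keyword. The user may still be typing
        # a longer identifier such as "object1", so also try it as an ID.
        candidates.append(tokens[:-1] + [("ID", tokens[-1][1])])
    accepted = []
    for candidate in candidates:
        types = [token_type for token_type, _ in candidate]
        if parses(types):
            accepted.append(types)
    return accepted


print(interpretations("var object"))  # [['VAR', 'ID']]
```

As lexed, `var object` is rejected because `object` is a keyword; retyping the last token as an identifier yields a valid parse, which tells the completion code that an identifier context is plausible at the caret.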

 

Sam


Mike Lischke

Sep 20, 2013, 3:05:08 AM
to antlr-di...@googlegroups.com

Hey Sam,

Thanks for your hints. However, I did not say that a parser would be of no use; I only tried to put its role in proportion to the additional code needed to get this working. In fact, I also use the parser to get the AST and enough information to decide what to show.

You know ANTLR inside and out, better than anybody else. But most people have neither that inside knowledge nor the time to dig that deep into ANTLR (and the generated parsers). So my conclusion is: use the parser to get the information that must be shown in the auto completion list (like class names, members, etc.), but to determine what to show, use some heuristics (I should add: unless you know the generated parser like the back of your hand).
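
A rough Python sketch of that split, under stated assumptions: the symbol tables below stand in for what a parser pass would collect (they are hard-coded here), and the look-behind rules in `complete` are one possible heuristic, not a description of any existing implementation. All names are hypothetical.

```python
KEYWORDS = ["class", "new", "return", "var"]  # hypothetical reserved words

# Symbol information that a parser pass over the source would normally
# collect; hard-coded here to keep the sketch self-contained.
CLASSES = {"Point": ["x", "y", "length"], "Path": ["points", "close"]}
VARIABLES = {"p": "Point", "route": "Path"}  # variable name -> declared type


def complete(before_caret):
    """Suggest completions for the text left of the caret using look-behind rules."""
    head, dot, prefix = before_caret.rpartition(".")
    if dot and head.split():
        # Member access: offer the members of the receiver's declared type.
        receiver = head.split()[-1]
        pool = CLASSES.get(VARIABLES.get(receiver), [])
    else:
        words = before_caret.split(" ")
        prefix = words[-1]
        previous = words[-2] if len(words) > 1 else ""
        if previous == "new":
            # Only class names make sense after `new`.
            pool = list(CLASSES)
        else:
            pool = KEYWORDS + list(CLASSES) + list(VARIABLES)
    return sorted(name for name in pool if name.startswith(prefix))


print(complete("p."))             # ['length', 'x', 'y']
print(complete("var q = new P"))  # ['Path', 'Point']
```

The heuristic never needs the text to parse as a whole, so it keeps working on the incomplete input that is typical while the user is typing.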
