To Sam Harwell. ANTLR and Visual Studio (2005/2008) Colorizer.

sskas...@gmail.com

May 29, 2013, 1:46:27 PM
to antlr-di...@googlegroups.com
Hi. I read your blog and StackOverflow posts about using ANTLR 3 for a Visual Studio language service, and I need advice on a robust approach to the colorizer. As you mentioned, while the user is typing, the text is almost always incorrect from the grammar's point of view. I have a solution for a whole file that works perfectly.
But in the VS colorizer I need to handle one line per colorization operation, and that is a really big problem where I need advice. (I currently use ANTLR 3.5 and the so-called 'old' VS API so that the LanguageService stays compatible with VS 2005/2008.)

So suppose that a token can span multiple lines, and the token continuation character is '\'. Suppose additionally that comments can appear inside a continued token.
In the colorizer I want to highlight keywords and class names.
I have a source code input file like this:

i\
n\
t //that stands for 'int' - it is a keyword and must be highlighted properly

OR:

i\
//....
//around about 100 lines of comments or blank lines like:


//...
nt //that stands for 'int' too and must be highlighted.

The question is simple: how can I do this highlighting without retokenizing the full file with the full lexer for every line?

Sam Harwell

May 29, 2013, 5:02:43 PM
to antlr-di...@googlegroups.com
This is a more complicated situation for the following reason: it is possible for the highlighting of a line to be directly affected by the contents of a line *after* it. If you look at a traditional C-style multiline comment, you'll notice that the highlighting of a line is only really affected by lines before it. If a previous line contained /*, and */ has not been reached, then the current line starts as a comment and proceeds to either a */ or the end of the line, whichever comes first. This allows you to use lexer modes (a built-in feature of ANTLR 4), with /* entering the block comment mode and */ leaving it. For line-by-line highlighting in Visual Studio, the 32-bit integer containing the state at the end of the line is used to store information about the lexer mode to start with on the next line.
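
As a rough illustration of the carry-over state described above, here is a minimal Python sketch (not ANTLR or Visual Studio code). The names `colorize_line`, `DEFAULT_MODE`, and `BLOCK_COMMENT_MODE` are hypothetical, and the scanner only distinguishes block comments from everything else. The integer returned for one line is passed back in as the start state of the next line.

```python
# Hypothetical mode values; a real colorizer would define its own set.
DEFAULT_MODE = 0
BLOCK_COMMENT_MODE = 1


def colorize_line(line, start_state):
    """Scan one line and return (spans, end_state).

    spans is a list of (start, end, kind) with kind "comment" or "text".
    start_state is the mode left over from the end of the previous line.
    """
    spans = []
    mode = start_state
    pos = 0
    while pos < len(line):
        if mode == BLOCK_COMMENT_MODE:
            # Already inside a comment: it runs to */ or to the end of the line.
            close = line.find("*/", pos)
            end = len(line) if close == -1 else close + 2
            spans.append((pos, end, "comment"))
            if close != -1:
                mode = DEFAULT_MODE
            pos = end
        else:
            opener = line.find("/*", pos)
            end = len(line) if opener == -1 else opener
            if end > pos:
                spans.append((pos, end, "text"))
            pos = end
            if opener != -1:
                # Search for */ after the opener so that "/*/" is not closed early.
                close = line.find("*/", opener + 2)
                end = len(line) if close == -1 else close + 2
                spans.append((opener, end, "comment"))
                mode = BLOCK_COMMENT_MODE if close == -1 else DEFAULT_MODE
                pos = end
    return spans, mode


# Usage: thread the state through consecutive lines.
state = DEFAULT_MODE
for text in ["int a; /* start", "still comment", "end */ int b;"]:
    spans, state = colorize_line(text, state)
    print(spans, state)
```

Each line is colored using only its own text plus one integer from the line before it, which is why this scheme cannot express a dependency on later lines.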

My "robust" approach to highlighting using ANTLR 3.5 is actually a very complicated workaround for the fact that ANTLR 3.5 lexers did not include modes. To simulate modes, I created fragment rules for each "mode" and called those fragments instead of the automatically generated mTokens() method based on manually-maintained state information. This helps with many multiline tokens, but not ones which require information about text appearing on lines later.

One thing you could do here is use the semantic highlighting approach to handle the cases you described. Semantic highlighting is typically implemented as a lower-priority asynchronous operation that highlights tokens based on information which is either not available or would reduce the speed of the performance-critical syntax highlighting operation. This includes things like highlighting type, field, and method references. In your case, one of the semantic highlighting capabilities could be highlighting keywords which were broken across multiple lines. The semantic highlighting pass is often explicitly delayed to reduce its net performance impact on the application. While I would prefer something as simple as keyword highlighting to appear instantly, I think it's safe to assume that the actual uses of this feature would be infrequent (to say the least). Keywords which appear entirely on one line would be highlighted immediately.
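
A minimal Python sketch of what such a pass might compute for the continuation case in this thread. Everything here is an assumption made for illustration: the `KEYWORDS` set, the function name `split_keyword_spans`, and the naive handling of `//` comments (string literals are ignored). It joins continued lines, skips comment-only and blank lines inside a continued token, and reports per-line spans only for keywords that cross a line boundary. In a real extension this would presumably run from a delayed background task, with the results layered over the fast per-line colorizer.

```python
import re

KEYWORDS = {"int", "if"}  # hypothetical keyword set


def split_keyword_spans(lines):
    """Return [(line_index, start_col, end_col)] for keywords split across lines."""
    chars = []    # logical text with continuation characters removed
    origin = []   # origin[i] is the (line_index, column) of chars[i]
    continuing = False
    for line_index, raw in enumerate(lines):
        code = raw.split("//", 1)[0]  # naive: drops everything after //
        if continuing and code.strip() == "":
            continue  # comment-only or blank line inside a continued token
        stripped = code.rstrip()
        continues = stripped.endswith("\\")
        if continues:
            code = stripped[:-1]
        for col, ch in enumerate(code):
            chars.append(ch)
            origin.append((line_index, col))
        if not continues:
            # A real line break ends any token in progress.
            chars.append("\n")
            origin.append((line_index, len(code)))
        continuing = continues

    spans = []
    for match in re.finditer(r"[A-Za-z_]\w*", "".join(chars)):
        if match.group() not in KEYWORDS:
            continue
        pieces = origin[match.start():match.end()]
        if len({line for line, _ in pieces}) < 2:
            continue  # single-line keyword: left to the per-line colorizer
        for line, col in pieces:
            # Merge adjacent characters on the same line into one span.
            if spans and spans[-1][0] == line and spans[-1][2] == col:
                spans[-1] = (line, spans[-1][1], col + 1)
            else:
                spans.append((line, col, col + 1))
    return spans


# Usage with the two inputs from the original question.
print(split_keyword_spans(["i\\", "n\\", "t //that stands for 'int'"]))
print(split_keyword_spans(["i\\", "//....", "", "//...", "nt //that stands for 'int' too"]))
```

Because the pass needs the lines after the one being colored, it has to look at a whole region of the file rather than a single line, which is the reason to keep it out of the performance-critical path.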

--
Sam Harwell
Owner, Lead Developer
http://tunnelvisionlabs.com

sskas...@gmail.com

May 30, 2013, 1:03:44 PM
to antlr-di...@googlegroups.com
Yes, I need to color identifiers split across multiple lines. And you are absolutely right: every line is affected by the content of both the previous and the next lines. Moreover, keywords are not fixed, and whether a concrete token is an identifier or a keyword is only known after the parsing phase. Quick sample:

i\
f( a == b) ... // here 'if' is a keyword and needs to be highlighted as a keyword

BUT:

if = a + b // here 'if' - is just an identifier and must be highlighted as an identifier

So colorizing is heavily affected by the semantic context of a code statement. If I understand you correctly, I first need to gather all the information from the parse and then try to use it in my colorizer?
Can you please explain what the asynchronous semantic highlighting approach is and how it could be implemented in VS (or just give a reference where I can read about it)? I have a strong need to implement this feature.

sskas...@gmail.com

May 30, 2013, 1:21:21 PM
to antlr-di...@googlegroups.com, sskas...@gmail.com
And another quick note. I use an approach similar to yours to store the colorizer lexer state in 32-bit integers (which are maintained by VS itself), and I also use 'emulated' modes instead of calling the mTokens() function. I know that there is another approach in which the line state information is maintained not by VS but by the extension package itself (but I still have not found out how to do that).
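
For illustration only, one way such a per-line integer could be laid out is sketched below in Python. The layout is an assumption, not something stated in this thread: the low 8 bits hold the emulated lexer mode, and one further bit records whether the line ended with the continuation character. The names `pack_state` and `unpack_state` are hypothetical.

```python
MODE_BITS = 8
MODE_MASK = (1 << MODE_BITS) - 1   # low 8 bits: emulated lexer mode
CONTINUED_FLAG = 1 << MODE_BITS    # bit 8: line ended with '\'


def pack_state(mode, continued):
    """Combine the mode and the continuation flag into one integer."""
    if not 0 <= mode <= MODE_MASK:
        raise ValueError("mode does not fit in 8 bits")
    return mode | (CONTINUED_FLAG if continued else 0)


def unpack_state(state):
    """Recover (mode, continued) from a packed state."""
    return state & MODE_MASK, bool(state & CONTINUED_FLAG)


# Usage: the value stored at the end of one line is decoded at the start of the next.
end_of_line_state = pack_state(1, True)
print(unpack_state(end_of_line_state))
```

Only 9 of the 32 bits are used here, so there would be room for other small pieces of per-line information under the same scheme.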

I also understand that token continuation info is really a parser thing and is not accessible in a colorizer lexer until the full file has been parsed. So I think that my only hope of colorizing tokens properly is to store the parsing information and use it somehow in the colorizer. But this is the first time I have faced this, and I am really confused about how to deal with the situation. Actually, I have never heard of the 'semantic highlighting approach', especially how to do it as a lower-priority asynchronous operation. Must this be done in the colorizer or in some other place (and if so, what is this 'magic' place)?

sskas...@gmail.com

Jun 1, 2013, 2:07:42 AM
to antlr-di...@googlegroups.com, sskas...@gmail.com
Hi. I searched for 'how to implement semantic highlighting', but there is nothing about it on the VSX forum from Microsoft or on StackOverflow. On the old [antlr-interest] list you mentioned something like it (that you use something similar for UnrealScript), but no link, description, or sample was given.
I would really appreciate it if you could give me a link where I could read about it (especially how to implement it).


sskas...@gmail.com

Jul 17, 2013, 2:08:24 PM
to antlr-di...@googlegroups.com
Still waiting for a response. If any useful link exists, could you share it, Sam?

Sam Harwell

Jul 18, 2013, 1:27:06 AM
to antlr-di...@googlegroups.com
I don't have any publicly available sample code or documentation for this feature at this time. Currently we only provide this support under custom development contracts.

Thank you,
--
Sam Harwell
Owner, Lead Developer
http://tunnelvisionlabs.com

sskas...@gmail.com

Jul 22, 2013, 1:02:05 PM
to antlr-di...@googlegroups.com
Thanks for the response, Sam. I was prepared to hear something like that, but hoped that there was some "open" (non-commercial) info. Just one final question: do you plan to write *any* paper (or post) about this feature in the near future? If so, please let me know.

Many thanks.

Sam Harwell

Jul 22, 2013, 6:35:32 PM
to antlr-di...@googlegroups.com
I don't have any plans to do that right now. Also, if I did write public documentation on this feature, it would likely target the Visual Studio 2010+ SDK, which is substantially different from prior versions of Visual Studio and would be very difficult to reuse in the earlier environments.

Thank you,
--
Sam Harwell
Owner, Lead Developer
http://tunnelvisionlabs.com

sskas...@gmail.com

Jul 23, 2013, 2:51:23 PM
to antlr-di...@googlegroups.com
Thanks again. I am closing this topic now. Last but not least, would it be possible to create a thread or label here for using ANTLR v3 or v4 for Visual Studio extensibility? I believe it would be very useful for folks like me.
Your blog posts give a good starting point, but there are a number of tricky things, like the one I mentioned here, that require a better description/discussion.
To mark such threads about using ANTLR and the VS SDK, I propose creating a "VS SDK" label here. Maybe this proposal could be made to Ter.