Let's assume that you are using the grammars-v4/csharp grammar (you don't say which grammar you are using). Antlr visitors and listeners have their time and place, but a much cleaner approach here is to use the Trash toolkit (https://github.com/kaby76/Domemtech.Trash). You will first need to wrap the input in a "class" declaration, as the grammar is C#6-ish. After generating a parser with "trgen" and building it, run "trparse", pipe the parse tree data to "trquery" to remove the comments, and then to "trtext" to serialize the frontier of the parse tree. Trash represents all tokens that are not on the "default channel" as attributes. In XPath, attributes are referenced with an "at-sign" (@).
$ trparse test.in | trquery 'delete //method_body//(@DELIMITED_COMMENT | @SINGLE_LINE_COMMENT)' | trtext
CSharp 0
test.in success 0.1716372
class foo {
public void bark() {
Console.WriteLine("Bark Bark !!");
double resultComplex = ((10 + 5) * (20 - 7)) / (8 * 4);
Console.WriteLine("Complex Calculation: " + resultComplex);
int i = 0;
while (i < 15)
{
Console.WriteLine("Value of i: " + i);
int x = 10;
i++;
}
}
}
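The "class" wrapper step mentioned earlier can be done with a couple of shell commands before running trparse. This is only a minimal sketch: the file name methods.cs and its sample contents are hypothetical, and the class name foo is taken from the output shown above.

```shell
# Hypothetical input: a file that holds only method declarations.
cat > methods.cs <<'EOF'
public void bark() {
    // say something
    Console.WriteLine("Bark Bark !!");
}
EOF

# Wrap the methods in a class declaration and write the result to test.in,
# the file that is passed to trparse.
{
    echo 'class foo {'
    cat methods.cs
    echo '}'
} > test.in

cat test.in
```

After the pipeline has run, the first and last lines can be stripped again if you need the bare methods back.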
If you also need to remove the extra "WHITESPACES" tokens, you will need to update the grammar with a special lexer rule that recognizes blank lines, then pass the parse tree through a second "trquery" to delete those attributes within a method_body. The grammar change is needed because XPath2 is limited in the kinds of string operations it can perform.
Ken