I get that, I do. I think there should be an option in ANTLR to include comments or not. Comments do matter, even overdone ones. We are using ANTLR as a preprocessor for our analysis tools, and it has done pretty well. It's just that as we see more and more code generators and other tools, we seem to be seeing a lot more actual information in comments. So, as I mentioned above, we "backfit" comments into the stream. It's not ideal, but it does work.
Inline comments are particularly pesky, but they rarely carry additional information, or even code-readable information.
Full-line comments occasionally contain SBOM information, and sometimes the fingerprint for the module.
In a perfect world, I guess I have talked myself into having two sets of comment processes that dovetail. On my to-do list is modifying the comment rules in the language grammars so that they operate rationally.
//
// Whitespace and comments
//
WS : [ \t\r\n\u000C]+ -> skip
;
COMMENT
: '/*' .*? '*/' -> skip
;
LINE_COMMENT
: '//' ~[\r\n]* -> skip
;
Leave LINE_COMMENT alone and simply grab COMMENT instead of skipping it:
COMMENT
: '/*' COMMENT_DATA*? '*/'
;
COMMENT_DATA
: '!VSC' STRUCTURED_COMMENT
| UNSTRUCTURED_COMMENT
;
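For what it's worth, here is one way the sketch above might be written as a valid ANTLR 4 lexer grammar. It is a minimal sketch under assumptions, not a tested implementation: the grammar name CommentSketch and the channel name STRUCTURED are made up for illustration, and it assumes '!VSC' immediately follows the opening '/*'. Instead of skipping comments, it routes them to channels, so the parser never sees them but a tool can still read them from the token stream.

```antlr
// Hypothetical sketch: names CommentSketch and STRUCTURED are illustrative only.
lexer grammar CommentSketch;

// Custom channels must be declared in a lexer grammar (not a combined one).
channels { STRUCTURED }

// Assumption: a structured comment starts with '/*!VSC'.
// Listed before COMMENT so it wins when both rules match the same text.
STRUCTURED_COMMENT
    : '/*!VSC' .*? '*/' -> channel(STRUCTURED)
    ;

// Ordinary block comments are kept, but hidden from the parser.
COMMENT
    : '/*' .*? '*/' -> channel(HIDDEN)
    ;

// Line comments are left alone and still dropped.
LINE_COMMENT
    : '//' ~[\r\n]* -> skip
    ;

WS  : [ \t\r\n\u000C]+ -> skip
    ;
```

Tokens sent to a channel stay in the token stream, unlike skipped ones, so a tool could pick them up afterwards (in the Java runtime, for example, through the buffered token stream's hidden-token lookups) rather than backfitting them.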
I think this discussion is haunted by the perfect being the enemy of the good enough. Having now analyzed several thousand programs (literally), I have found fewer than five situations where an inline comment was important.
Just a stray thought.