Hey all -
I'm happy to announce the release of version 0.1.2 of Glow, the syntax highlighting library I open-sourced about a month ago.
This release makes some major changes to how the library works under the hood. In essence, I've moved away from a regular-expression-driven approach in favor of using a lexical parser to build up a parse tree from the input string, which is then transformed into a syntax-highlighted string.
This has two advantages. First, it avoids a major problem I was having with the regular expression approach: a StackOverflowError when attempting to match long string literals in the input string. Second, it produces an intermediate parse tree which can be transformed using fairly standard Instaparse transformations to achieve arbitrary formatting output (handy if, for instance, one wanted to produce HTML output or Hiccup templates).
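To make the tree-then-transform idea concrete, here's a rough, language-neutral sketch in Python. It is not Glow's code or API: the toy grammar (lists, strings, keywords, integers, symbols) and every name in it (`tokenize`, `parse`, `transform`, the `*_leaf` / `*_wrap` renderers) are made up for illustration, and it normalizes whitespace, which a real highlighter wouldn't want to do. The point is just that one parse tree can feed several output formats, and that a string literal is scanned with a plain loop, so its length doesn't matter.

```python
import html

def tokenize(src):
    """Split source into (kind, text) tokens with a hand-written scanner."""
    tokens, i = [], 0
    while i < len(src):
        c = src[i]
        if c.isspace():
            i += 1
        elif c in "()":
            tokens.append(("paren", c))
            i += 1
        elif c == '"':
            j = i + 1
            # Plain loop: a long string literal costs time, not stack depth.
            while j < len(src) and src[j] != '"':
                j += 2 if src[j] == "\\" else 1
            if j >= len(src):
                raise ValueError("unterminated string literal")
            tokens.append(("string", src[i:j + 1]))
            i = j + 1
        else:
            j = i
            while j < len(src) and not src[j].isspace() and src[j] not in '()"':
                j += 1
            text = src[i:j]
            if text.startswith(":"):
                kind = "keyword"
            elif text.lstrip("-").isdigit():
                kind = "number"
            else:
                kind = "symbol"
            tokens.append((kind, text))
            i = j
    return tokens

def parse(tokens):
    """Build a tree: ("file", [...]) / ("list", [...]) nodes, (kind, text) leaves."""
    pos = 0

    def form():
        nonlocal pos
        kind, text = tokens[pos]
        pos += 1
        if (kind, text) == ("paren", "("):
            children = []
            while pos < len(tokens) and tokens[pos] != ("paren", ")"):
                children.append(form())
            if pos >= len(tokens):
                raise ValueError("unbalanced parentheses")
            pos += 1  # consume the closing ")"
            return ("list", children)
        if kind == "paren":
            raise ValueError("unexpected ')'")
        return (kind, text)

    forms = []
    while pos < len(tokens):
        forms.append(form())
    return ("file", forms)

def transform(tree, leaf, wrap):
    """Walk the tree bottom-up, calling leaf() on tokens and wrap() on containers."""
    kind, body = tree
    if kind in ("file", "list"):
        return wrap(kind, [transform(child, leaf, wrap) for child in body])
    return leaf(kind, body)

# Renderer 1: ANSI colors for a terminal (color choices are arbitrary).
ANSI = {"string": "32", "keyword": "35", "number": "36"}

def ansi_leaf(kind, text):
    code = ANSI.get(kind)
    return f"\x1b[{code}m{text}\x1b[0m" if code else text

# Renderer 2: HTML spans with the token kind as the CSS class.
def html_leaf(kind, text):
    return f'<span class="{kind}">{html.escape(text)}</span>'

def wrap(kind, parts):
    joined = " ".join(parts)
    return f"({joined})" if kind == "list" else joined

tree = parse(tokenize('(println "hi" :k 42)'))
print(transform(tree, ansi_leaf, wrap))
print(transform(tree, html_leaf, wrap))
```

Swapping the output format only means swapping the `leaf` function; the parse step is untouched. Note that `parse` raises on unbalanced input, which mirrors the "input must be valid code" trade-off of any parser-based approach.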
It has some disadvantages as well. It's somewhat slower, though I believe the code to be more maintainable and comprehensible. It also requires the input source code to be valid and parseable Clojure code, which you can think of as either an advantage or a disadvantage :P
Although I had hoped initially to use Instaparse for the project's parser, I ended up getting considerably better performance from ANTLR / clj-antlr, and so that's the path I've gone down. You can take a look at the ANTLR Clojure grammar Glow uses here - I believe it to be comprehensive, and it handles a number of edge cases that the grammar in the main ANTLR grammar repo does not.
Cheers,
- V