Hi Fidel,
I would use lexer states and the filter method to handle such strings. I would define generic multiline_string_start and multiline_string_end tokens and use a filter to match them:
...
States
normal, string_body;
...
Tokens
{normal} multiline_string_start = '[' '='* '[';
{string_body} multiline_string_end = ']' '='* ']';
{string_body} multiline_string_character = any_character;
The filter for multiline_string_start would record the number of '=' characters and change the state to string_body. The filter for multiline_string_end would count the number of '=' characters in the matched text. If the count matches the recorded number, the filter changes the state back to normal. If it doesn't, the filter pushes the second through last characters of the token back into the lexer's PushbackReader (in reverse order, of course) and replaces the multiline_string_end token with a multiline_string_character token corresponding to the first ']' character.
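To make the counting and pushback logic concrete, here is a minimal standalone sketch in plain Java. It does not use the SableCC-generated Lexer class or its filter() method; the class LongStringScanner and its methods are hypothetical names invented for this illustration, and the whole thing only simulates by hand what the two filters described above would do around the DFA.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;

// Hypothetical illustration only: not part of SableCC or its generated code.
final class LongStringScanner {

    // Consumes a run of '=' characters and returns its length.
    // The first non-'=' character (if any) is pushed back.
    static int countEquals(PushbackReader in) throws IOException {
        int n = 0;
        int c;
        while ((c = in.read()) == '=') {
            n++;
        }
        if (c != -1) {
            in.unread(c);
        }
        return n;
    }

    // Scans one string of the form '[' '='* '[' body ']' '='* ']' and
    // returns the body. The closing delimiter must carry the same number
    // of '=' characters as the opening one.
    static String scan(String source) throws IOException {
        // The pushback buffer is sized generously so that unread() never overflows.
        PushbackReader in =
            new PushbackReader(new StringReader(source), source.length() + 1);

        // "multiline_string_start": record the '=' count, enter string_body.
        if (in.read() != '[') {
            throw new IllegalArgumentException("expected '['");
        }
        int recorded = countEquals(in);
        if (in.read() != '[') {
            throw new IllegalArgumentException("expected second '['");
        }

        StringBuilder body = new StringBuilder();
        while (true) {
            int c = in.read();
            if (c == -1) {
                throw new IllegalArgumentException("unterminated string");
            }
            if (c != ']') {
                body.append((char) c); // ordinary multiline_string_character
                continue;
            }
            // Candidate "multiline_string_end": ']' '='* ']'
            int count = countEquals(in);
            int last = in.read();
            if (last == ']' && count == recorded) {
                return body.toString(); // counts match: back to normal state
            }
            // Not the real end: push back everything after the first ']',
            // last character first, and keep that ']' as a body character.
            if (last != -1) {
                in.unread(last);
            }
            for (int i = 0; i < count; i++) {
                in.unread('=');
            }
            body.append(']');
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(scan("[==[a]=]b]]c]==]")); // prints a]=]b]]c
    }
}
```

In the example input, the candidates "]=]" and "]]" are rejected because they carry one and zero '=' characters rather than the recorded two, so only their first ']' is kept as body text and the rest is read again.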
A deterministic finite automaton (as used by SableCC lexers) has no memory with which to match exact '=' counts. An LR automaton (as used by SableCC parsers) has a single stack; this is sufficient to handle recursion properly, but it's insufficient to match '=' counts or, similarly, in a programming language with declarations, to detect whether a variable has been declared. Such counting requires a Turing machine (which is equivalent to a DFA with two stacks or an infinite tape). When possible, SableCC encourages delaying semantic analysis (that is, the analysis of everything not already handled by the lexer's DFA and the parser's LR automaton) to a separate phase that follows lexing and parsing, using tree traversals based on the DepthFirstAdapter class. But sometimes, as with multiline strings, some semantics must be handled during lexing and parsing and can't be deferred. The filter() methods of the Lexer and Parser classes let you use a Turing-complete language (Java) to analyze semantics during lexing and parsing, and to affect the output.
In other words, there's no way to escape the use of the filter method to handle the case you provided, which requires counting '=' characters for lexical matching.
Have fun!
Etienne