No, this is not about relationship troubles.
I am struggling to work with rejection events. I am trying to deal with constructs like preprocessor statements or meaningful comments in programming languages. These (i) can go anywhere in the grammar, (ii) need to be propagated into the parse tree, (iii) may affect the parse itself, and (iv) cannot easily be parsed with a grammar or an internal lexer.
My idea to parse such constructs was to create lexemes invoked by fake G1 productions which would be tried when the relevant text is encountered and would create a rejection event. I would then parse the text of these constructs in an external recognizer upon handling the rejection event and insert the proper text back into the input string and set the continuation of the parse to the start of the replacement text. If the replacement text is legal at the inserted point, parsing should continue just fine, thanks to the great infrastructure provided by Marpa.
However, things did not go as planned. Please look at the attached example for details. In this example, I try to handle preprocessor statements (#ifdef).
I created a very simple grammar, and added these productions:
fakecpp ::= cpp
cpp ~ '#'
The fakecpp production is not actually reachable. Consider, for example, this input string:
abc\n#ifdef A\n=\n#else\n+\n#endif\n12
When the parser hits the "#ifdef", I get a rejection event, and I thought I could clean things up in the handler:
$pos = $pos + $len - $newlen + 1;
substr($string, $pos, $newlen) = $cpp2;
($string is the original string, $pos is the current position, $len is the total length of the ifdef block, $newlen is the length of the replacement text, and $cpp2 is the replacement text.) I insert the replacement text at the end of the ifdef block and set the position to just before the replacement text. I had hoped that upon resume the parser would read the replacement text and be happy.
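To make the offset arithmetic concrete, here is a minimal standalone sketch of the splice alone, with no Marpa involved. It assumes the external recognizer reports the start and length of the whole directive block, and that $len counts the trailing newline (which is why the sketch needs no "+ 1", unlike the handler above). The names $end and $resume are only illustrative, and choosing "=" as the replacement assumes A is defined.

```perl
use strict;
use warnings;

# Input taken from the example string above.
my $string = "abc\n#ifdef A\n=\n#else\n+\n#endif\n12";

# Assumption: the external recognizer knows where the directive block
# starts and ends. Here the span is simply located with index().
my $pos = index($string, '#ifdef');                           # start of block
my $end = index($string, "#endif\n") + length("#endif\n");    # one past block
my $len = $end - $pos;                  # block length, trailing newline included

my $cpp2   = "=\n";                     # replacement text (assumes A is defined)
my $newlen = length $cpp2;

# Overwrite the last $newlen characters of the block with the replacement.
# The string keeps its length, so offsets after the block do not move.
my $resume = $pos + $len - $newlen;     # hypothetical name: where to resume
substr($string, $resume, $newlen) = $cpp2;

# Show what the parser would see if it resumed at $resume.
print substr($string, $resume), "\n";
```

Because the replacement overwrites the tail of the directive block rather than being inserted, everything after the block keeps its original offset. This only works while the replacement is no longer than the block it replaces.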
Unfortunately, the parser was not happy on resume. I did get the following alternative to work: find out which lexeme was expected, read it externally with lexeme_read, and continue with the text after it.
$pos = $pos + $len + 1;
$recce->lexeme_read('OP', $pos, 1, '=');
But this approach only works because the grammar is so simple that I can handle every possible rejection just by looking at the expected lexemes.
Note that if I put the "=" into the input string and try to continue parsing from just before it, I get another rejection event at that very position. This is really strange: the grammar expects an OP and I give it an OP, yet it is rejected.
I must be doing something wrong, since it seems there should be a way to get this to work.
Any suggestions would be greatly appreciated.
Thanks, Th.