Perl regexps have context-sensitive capabilities that can parse Sn(...):
perl -E '"S5(abc)d)" =~ /S(\d+)\(((??{".{$1}"}))\)/x && say $2'
This "trick" can probably be used to parse the whole thing:
http://perldoc.perl.org/perlre.html#(%3f%3f%7b-code-%7d)
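A rough sketch of how the same trick might be stretched over a flat
sequence of Sn(...) items, using \G to continue where the previous match
stopped. The helper name hollerith_strings and the sample input are made
up for illustration, and nested items are not handled here.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: return the payloads of consecutive Sn(...) items.
sub hollerith_strings {
    my ($input) = @_;
    my @strings;
    # \G anchors each attempt at the end of the previous match; the
    # (??{...}) block builds a sub-pattern matching exactly $1 characters.
    while ($input =~ /\G S(\d+) \( ( (??{ "(?s:.{$1})" }) ) \) /gcx) {
        push @strings, $2;
    }
    # With /c the position survives the final failed match, so we can
    # check that the whole input was consumed.
    my $end = pos($input) // 0;
    die "unparsed input at offset $end\n" if $end != length $input;
    return @strings;
}

say for hollerith_strings('S3(Hey)S5(abc)d)');
```

The second item contains a ")" inside its payload, which is exactly the
case where the declared length is needed to find the real end.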
> Marpa you'd use the Ruby Slippers and some lexer magic. For the arrays, the
> counts are purely redundant information, and can be simply be thrown away.
If we can throw away the array element counts, then it's easy.
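To show what "easy" might look like, here is a minimal hand-written
sketch that ignores the array counts and relies only on the parentheses
and the string lengths. It assumes arrays are written An(...) by analogy
with Sn(...); the names parse_item and parse_dh and the sample input are
mine, not from the post.

```perl
use strict;
use warnings;
use feature 'say';

# Parse one item at the current pos() of the referenced string.
# Returns a plain string for Sn(...) or an array ref for An(...).
sub parse_item {
    my ($ref) = @_;
    if ($$ref =~ /\GS(\d+)\(/gc) {
        my $len = $1;
        my $str = substr($$ref, pos($$ref), $len);
        die "string shorter than declared length\n" if length($str) < $len;
        pos($$ref) = pos($$ref) + $len;      # skip the payload verbatim
        $$ref =~ /\G\)/gc or die "expected ')' at offset " . pos($$ref) . "\n";
        return $str;
    }
    if ($$ref =~ /\GA\d+\(/gc) {             # the count is matched but ignored
        my @items;
        until ($$ref =~ /\G\)/gc) {
            die "unterminated array\n" if pos($$ref) >= length $$ref;
            push @items, parse_item($ref);
        }
        return \@items;
    }
    die "unexpected input at offset " . (pos($$ref) // 0) . "\n";
}

# Parse a complete input and insist that nothing is left over.
sub parse_dh {
    my ($input) = @_;
    pos($input) = 0;
    my $tree = parse_item(\$input);
    die "trailing input at offset " . pos($input) . "\n"
        if pos($input) != length $input;
    return $tree;
}

my $tree = parse_dh('A2(A2(S3(Hey)S13(Hello, World!))S5(Ciao!))');
say $tree->[0][1];
say $tree->[1];
```

The string lengths still have to be honoured, since a payload may itself
contain parentheses; only the array counts are redundant.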
> This leaves the problem of matching parentheses. Dyck languages are LL(1),
> so within the capabilities of Recursive Descent, and well within those of
> Marpa. So, again, you can take your choice.
I think a solution with Marpa::XS wouldn't be more complex than the
RecDescent solution presented in the post. In this particular case the
grammar and lexer are not ambiguous, but in the ambiguous case
RecDescent has one important advantage: actions get context and can be
used for "lexing". It's not bad that the lexer is detached in Marpa, but
it's bad that the lexer has to take care of gathering parsing context on
its own.
For example, suppose parsing has diverged at some position into several
token streams and the lexer has moved further. You ask the parser for
expected terminals, but the list is contextless: it doesn't give you
access to the stream(s) a terminal is expected in.
In such cases you'll have to move more out of the grammar into the
lexer, or loosen the grammar/lexer and deal with it later. For me,
moving things out of the grammar devalues the solution.
It's just food for thought on improving the high-level interface on top
of the Marpa algorithm.
--
Best regards, Ruslan.