Re: Flavio Polletti's Hollerith-Dyck language

Showing 1-3 of 3 messages
Re: Flavio Polletti's Hollerith-Dyck language Ruslan Zakirov 4/10/12 8:50 AM
On Tue, Apr 10, 2012 at 05:46, Jeffrey Kegler
<> wrote:
> On the Perl blogs Flavio Polletti tackled a language using Parse::RecDescent
> and the question was asked: could Marpa do better?  I think the answer ends
> up as a a matter of taste, and of priorities in the specific situation.
> Here are my thoughts, untested, and off the top of my head:
> Here's an example of Flavio's language:
> A2(A2(S3(Hey)S13(Hello, World!))S5(Ciao!))
> Basically, this language is Hollerith constants inside a Dyck language
> (balanced parentheses).  Hollerith constants are named after the inventor of
> punched cards and went out about the same time.  Their demise was probably
> more because of the difficulties they present to parsing, then their tie to
> paper media.  Word got around that escaping delimiters was the easier choice
> and 15 years ago support for Hollerith constants was even removed from the
> language which pioneered them, Fortran.
> Hollerith constants are context-sensitive -- BNF cannot describe them.
> Therefore neither Parse::RecDescent or Marpa can parse them without some
> hackery.  For strings, Flavio uses a typical recursive descent hack, and in

Perl Regexps has context sensitive capabilities to parse Sn(...)

perl -E '"S5(abc)d)" =~ /S(\d+)\(((??{".{$1}"}))\)/x && say $2'

This "trick" probably can be used to parse whole thing

> Marpa you'd use the Ruby Slippers and some lexer magic.  For the arrays, the
> counts are purely redundant information, and can be simply be thrown away.

If we can throw away number of array elements then it's easy.

> This leaves the problem of matching parentheses.  Dyck languages are LL(1),
> so within the capabilities of Recursive Descent, and well within those of
> Marpa.  So, again, you can take your choice.

I think solution with Marpa::XS wouldn't be more complex than
RecDescent solution presented in the post. However, in this particular
case grammar and lexer are not ambiguous, but in ambiguos case
RecDescent has one important advantage. Actions get context and can be
used for "lexing". It's not bad that lexer is detached in Marpa, but
it's bad that lexer has to take care of gathering parsing context on
its own.

For example parsing diverged at some position into several token
streams, lexer moved further, you ask parser for expected terminals,
but the list is contextless as it doesn't give you access to the
stream(s) a terminal is expected in.

In such cases you'll have to move more out of grammar into lexer or
free grammar/lexer to deal with it later. For me moving things out of
grammar devalues solution.

It's just food for thoughts on improving high level interface on top
of marpa algorithm.

Best regards, Ruslan. 4/9/12 6:46 PM <This message has been deleted.> 4/11/12 9:31 PM <This message has been deleted.>