Why does this work?

Steve Tuckner

Dec 11, 2011, 9:17:22 AM
to Treetop Development
I have a grammar that now parses the following string:

('123' = \"red\" and '456' = \"large\") or '456' = \"small\"

grammar RemedyQualifications
  rule expression
    simple_match ws 'and' ws expression <AndExpression> /
    simple_match ws 'or' ws expression <OrExpression> /
    simple_match
  end

  rule simple_match
    '(' ws expression ws ')' <ParenthesizedExpression> /
    field ws '=' ws value <SimpleMatch>
  end

  rule field
    "'" field_id:[0-9]+ "'" <Field>
  end

  rule value
    '"' field_value:[A-Za-z0-9 ]+ '"' <Value> /
    "$NULL$" <NullValue>
  end

  rule ws
    [ ]*
  end
end

It failed to parse when I had the line:

'(' ws expression ws ')' <ParenthesizedExpression> /

as the first entry in the expression rule. Why is that? I am happy
that it works but I don't understand why.

Thanks for any insights,

Steve

markus

Dec 11, 2011, 4:12:05 PM
to treet...@googlegroups.com

> I have a grammar that now parses the following string:
>
> ('123' = \"red\" and '456' = \"large\") or '456' = \"small\"
>
> grammar RemedyQualifications
> rule expression
> simple_match ws 'and' ws expression <AndExpression> /
> simple_match ws 'or' ws expression <OrExpression> /
> simple_match
> end
>
> rule simple_match
> '(' ws expression ws ')' <ParenthesizedExpression> /
> field ws '=' ws value <SimpleMatch>
> end
> :

> It failed to parse when I had the line:
>
> '(' ws expression ws ')' <ParenthesizedExpression> /
>
> as the first entry in the expression rule. Why is that? I am happy
> that it works but I don't understand why.

So (omitting details for clarity) when the rule

'(' ws expression ws ')' /

is at the top of expression it tries to parse

('123' = \"red\" and '456' = \"large\") or '456' = \"small\"

as a single expression that starts with '(' and ends with ')', but
this doesn't match (the ')' is in the middle of the expression, right
before the 'or'); and this is the only way the grammar can form an
expression starting with a '('.

An even easier way of saying it is that the first form was "wrong" and
what you have now is "right"; parentheses, as you are trying to use
them, are a way to form a simple_match out of a full expression (just as
in math they can be used to form a simple term out of a complex
expression). They are part of the syntax of simple_matches, not of
expressions, and therefore belong there.
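
To make the difference concrete, here is a minimal sketch in plain Ruby. It is a hand-rolled recognizer that mimics PEG ordered choice, not Treetop and not the code Treetop generates. The class name QualMatcher and the paren_first_in_expression flag are made up for illustration, and it assumes the failing version had the parenthesized alternative only in expression (moved out of simple_match). The test string is written without the backslash escapes.

```ruby
# Hand-rolled PEG-style recognizer, not Treetop. Every rule method takes a
# start position and returns the position after the match, or nil on failure.
class QualMatcher
  def initialize(input, paren_first_in_expression:)
    @s = input
    @paren_first = paren_first_in_expression
  end

  # The parse succeeds only if the start rule consumes the entire input.
  def parse
    expression(0) == @s.length
  end

  private

  # Match a literal string at pos.
  def lit(str, pos)
    @s[pos, str.length] == str ? pos + str.length : nil
  end

  # Match a regular expression, but only if the match starts exactly at pos.
  def regex(re, pos)
    m = re.match(@s, pos)
    m && m.begin(0) == pos ? m.end(0) : nil
  end

  # Run the steps one after another; fail as soon as one of them fails.
  def seq(pos, *steps)
    steps.each do |step|
      pos = step.call(pos)
      return nil if pos.nil?
    end
    pos
  end

  def ws(pos)
    regex(/[ ]*/, pos)
  end

  def parenthesized(pos)
    seq(pos, ->(p) { lit('(', p) }, method(:ws), method(:expression),
        method(:ws), ->(p) { lit(')', p) })
  end

  def expression(pos)
    if @paren_first
      r = parenthesized(pos)
      return r if r # first match wins; the alternatives below are never tried
    end
    %w[and or].each do |op|
      r = seq(pos, method(:simple_match), method(:ws), ->(p) { lit(op, p) },
              method(:ws), method(:expression))
      return r if r
    end
    simple_match(pos)
  end

  def simple_match(pos)
    r = @paren_first ? nil : parenthesized(pos)
    r || seq(pos, method(:field), method(:ws), ->(p) { lit('=', p) },
             method(:ws), method(:value))
  end

  def field(pos)
    seq(pos, ->(p) { lit("'", p) }, ->(p) { regex(/[0-9]+/, p) },
        ->(p) { lit("'", p) })
  end

  def value(pos)
    seq(pos, ->(p) { lit('"', p) }, ->(p) { regex(/[A-Za-z0-9 ]+/, p) },
        ->(p) { lit('"', p) }) || lit('$NULL$', pos)
  end
end

input = %q{('123' = "red" and '456' = "large") or '456' = "small"}
puts QualMatcher.new(input, paren_first_in_expression: false).parse
puts QualMatcher.new(input, paren_first_in_expression: true).parse
```

With the parentheses handled in simple_match, the whole string is consumed. With the parenthesized alternative first in expression, the rule succeeds right after the ')' and stops there, so the trailing "or ..." is left over and the parse as a whole fails.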

Hope that helps,
-- MarkusQ

Steve (Gmail)

Dec 12, 2011, 11:27:09 PM
to treet...@googlegroups.com
That sounds right in that it didn't match the complete expression, but
at another point I did try the following, and it didn't work either:

'(' expression ')' /
'(' expression ')' and expression /
'(' expression ')' or expression /

markus

Dec 13, 2011, 1:23:01 PM
to treet...@googlegroups.com
On Mon, 2011-12-12 at 22:27 -0600, Steve (Gmail) wrote:
> That sounds right in that it didn't match the complete expression, but
> I did try at another point the following and it didn't work either
>
> '(' expression ')' /
> '(' expression ')' and expression /
> '(' expression ')' or expression /

Assuming you meant something like:

'(' ws expression ws ')' ws 'and' ws expression /
'(' ws expression ws ')' ws 'or' ws expression /

(with "and" and "or" quoted and white space handling, etc.) that would
have worked if you'd put them before the case with no operator like so:

rule expression
  '(' ws expression ws ')' ws 'and' ws expression /
  '(' ws expression ws ')' ws 'or' ws expression /

  '(' ws expression ws ')' /

  simple_match ws 'and' ws expression /
  simple_match ws 'or' ws expression /
  simple_match
end

but in the order you wrote them, the expression rule would still match on
the no-operator case first, just as it had before. Remember, PEGs don't
take the longest match, or the best match, or the most appropriate match,
or anything like that. They take the first match, blithely move on, and
never look back. That gives them power and simplicity, but can and will
bite you until you get used to thinking that way.
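
The "first match wins" behaviour can be sketched in a few lines of plain Ruby. This is only a toy model: literal strings stand in for the grammar alternatives, and first_match and parses_fully? are hypothetical helpers, not part of Treetop.

```ruby
# Toy model of PEG ordered choice: return the first alternative that matches
# at the start of the input. Later alternatives are never consulted, even if
# one of them would have matched more of the input.
def first_match(alternatives, input)
  alternatives.find { |alt| input.start_with?(alt) }
end

# A parse only counts as a success if the chosen alternative covers the
# whole input.
def parses_fully?(alternatives, input)
  first_match(alternatives, input) == input
end

# Same three alternatives, in two different orders.
short_first = ['(e)', '(e) and e', '(e) or e']
long_first  = ['(e) and e', '(e) or e', '(e)']

puts parses_fully?(short_first, '(e) or e') # the bare '(e)' wins, input is left over
puts parses_fully?(long_first, '(e) or e')  # the operator forms get tried first
```

The only thing that changes between the two lists is the order, which is the whole point: an alternative that is a prefix of another has to come after it.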

-- MarkusQ


Steve (Gmail)

Dec 13, 2011, 1:26:28 PM
to treet...@googlegroups.com
Ok, that explains it. Thanks!