Consider that I am parsing HTML (a very frequent occurrence), and wish
to make a Rule that matches a balanced tag from open to close. I want
to use the same code many different times, but for different tags. So I
really want to say something like:
    rule baltag (Rule|Str $<tag>) {
        \< $<tag> \s* $<options> := (.*?) \>
        $<body> := (.*?)
        \</ $<tag> \>
    }
I could then do:
$buffer ~~ / <baltag title> /;
later on to match any <title> tag in my buffer.
I'm open to alternative syntaxes; this one is just there to illustrate
my point.
-- Rod Adams.
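The idea of a rule that takes the tag name as a parameter can be tried out today with a minimal sketch in Python's `re` module, used here only because the Perl 6 syntax above is still a proposal. The helper name `baltag`, the group names, and the sample buffer are assumptions made for illustration. Like the rule in the post, the sketch does not handle nested tags of the same name.

```python
import re

def baltag(tag):
    """Build a regex for one <tag ...>body</tag> element (hypothetical helper)."""
    name = re.escape(tag)  # treat the argument as a literal string
    return re.compile(
        rf"<{name}\s*(?P<options>.*?)>"   # opening tag and its attributes
        rf"(?P<body>.*?)"                 # shortest possible body
        rf"</{name}>",                    # matching close tag
        re.DOTALL,
    )

# The same code is reused for different tags by changing the argument.
buffer = '<html><title lang="en">Rules</title></html>'
m = baltag("title").search(buffer)
print(m.group("options"), m.group("body"))
```

The parameter plays the role of `$tag` in the proposed rule, and the two named groups stand in for `$<options>` and `$<body>`.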
No no no! That's too powerful.
Wow, I skimmed through both S5 and A5 and see no mention of such a
thing. I know we've had it planned for quite a while.
> Consider that I am parsing HTML (a very frequent occurrence), and wish
> to make a Rule that matches a balanced tag from open to close. I want
> to use the same code many different times, but for different tags. So I
> really want to say something like:
>
> rule baltag (Rule|Str $<tag>) {
> \< $<tag> \s* $<options> := (.*?) \>
> $<body> := (.*?)
> \</ $<tag> \>
> }
Replace $<tag> with $tag and you're all set. We may allow putting
$<tag> directly in the parameter list for inclusion in the parse tree.
Luke
No problem. That's how the arguments to rules like <before foo> are
already passed. If I recall, we originally specified three basic forms:
    <foo bar>       # bar is pattern
    <foo: bar>      # bar is string
    <foo(bar)>      # bar is Perl expression
though the middle one of those is the weakest, since it's equivalent to
<foo('bar')>
For that matter, the first one is just
<foo(/bar/)>
Anyway, these forms are somewhat negotiable yet. In recognition that
these are all methods underneath, I seem to recall switching the most
generic form to
<.foo()>
at some point, but I could be hallucinating. We could certainly just
get by with
<foo pattern>
and
<.foo(@arguments)>
But I kind of like the sub syntax even if they're really methods underneath.
I dunno...
Of course, one can call them like ordinary methods too, as long as one
supplies an appropriate pattern-matching context invocant. But it's
pretty handy to have that magically supplied for you inside <...>.
Larry
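The difference between passing a pattern (`<foo bar>`) and passing a string (`<foo: bar>`) can be illustrated with a rough Python analogy. The function `foo` and its `literal` flag are hypothetical and only mimic the distinction; they are not how Perl 6 rules are implemented.

```python
import re

def foo(arg, literal=False):
    """Hypothetical parameterized rule: match `arg` wrapped in square brackets."""
    # A string argument is matched character for character;
    # a pattern argument keeps its metacharacters.
    inner = re.escape(arg) if literal else arg
    return re.compile(rf"\[(?:{inner})\]")

as_pattern = foo(r"a.c")               # like <foo a.c>: '.' matches any character
as_string = foo("a.c", literal=True)   # like <foo: a.c>: '.' is a literal dot

pattern_hits_abc = as_pattern.search("[abc]") is not None
string_hits_abc = as_string.search("[abc]") is not None
string_hits_dot = as_string.search("[a.c]") is not None
print(pattern_hits_abc, string_hits_abc, string_hits_dot)
```

Under this analogy the string form is the pattern form with every character quoted, which is why the middle notation adds little beyond `<foo('bar')>`.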
Yes, this is written in A05, although it's hard to spot and easy
to overlook. The forms are in the large table under "Metacharacter reform":
    <name(expr)>            # call rule, passing Perl args
    { .name(expr) }         # same thing.
    <$var(expr)>            # call rule indirectly by name
    { .$var(expr) }         # same thing.
    <name pat>              # call rule, passing regex arg
    { .name(/pat/) }        # same thing.
    # maybe...
    <name: text>            # call rule, passing string
    { .name(q<text>) }      # same thing.
The argument form of subrules is not currently mentioned in S05.
I've been designing and implementing PGE consistent with the
above syntaxes.
Pm
>On Tue, Mar 01, 2005 at 11:06:17PM -0600, Rod Adams wrote:
>: Since the line between rules and subs is already blurring significantly,
>: I want to blur it a little more. I want to write rules which can take
>: parameters.
>
>No problem. That's how the arguments to rules like <before foo> are
>already passed.
>
Excellent!
>Of course, one can call them like ordinary methods too, as long as one
>supplies an appropriate pattern-matching context invocant. But it's
>pretty handy to have that magically supplied for you inside <...>.
>
>
Now for the tricky part.
    rule baltag (Str $tag, Str $body is rw) {
        \< $tag .*? \>
        $<body> := (.*?)
        \</ $tag \>
    }

    $buffer ~~ / <baltag title $<body>> .* \> <$body> \< /;
In other words, I want to pass a possibly unbound hypothetical into a
subrule, and let the subrule bind/rebind it.
An alternative would be a syntax for calling the subrule, and then
binding a hypothetical to one of the hypotheticals returned in the
subrule. I'm moderately sure S05 made this possible, but I couldn't put
all the pieces together. I'll hazard the following guess:
    rule baltag (Str $tag) {
        \< $tag .*? \>
        $<body> := (.*?)
        \</ $tag \>
    }

    $buffer ~~ / $<btag> := <baltag title> <($<body> := $btag<body>)> /;
Which seems very clumsy, especially if I wish to call it a lot. Be nicer
if I could push the work onto the subrule.
Thanks for pointing that out, Patrick. I'm impressed with how you've
assimilated all the S's & A's. (And yes, I love that the guy in charge
of implementing the language has that ability.)
And now some questions to hammer out some details on passing args to
subrules:
What if you wish to pass two args, the first a string, the second a rule?
Are you then forced to use <name(expr, expr)> syntax?
Does <name: text1 text2> get handled as <name(q<text1 text2>)> or as
<name(q<text1>, q<text2>)>, in which case it's really qw//?
If I declare my rule as:
rule MyRule (Str $text) { ... }
Will the P6RE be smart enough to pass:
m:{<MyRule /thisdir/>}
As a string, not the rule that it looks like?
If I define a named rule, how do I get a reference to it from outside a
rule?
A05 leads me to think that
my $rx := Rule.MyRule;
will work, assuming Rule is the default grammar. I would also suspect
that one could reverse this.
my $rx = rule { ... };
Rule.MyRule := $rx;
Thus defining a new rule dynamically.
-- Rod Adams
Yes, Patrick is a jewel. We'll probably wear some new facets on him though.
: And now some questions to hammer out some details on passing args to
: subrules:
:
: What if you wish to pass two args, the first a string, the second a rule?
: Are you then forced to use <name(expr, expr)> syntax?
Yes, or pass a string and parse it yourself.
: Does <name: text1 text2> get handled as <name(q<text1 text2>)> or as
: <name(q<text1>, q<text2>)>, in which case it's really qw//?
The former. It's a single string, which you can parse however you like.
Though I suppose we could extend the colon to a colon modifier:
<name:w text1 text2>
That's getting a little weird though, considering that in most other cases
such modifiers are outside the delimiters. Here's a really weird one:
<name:here END>
I'm more inclined to say that anything beyond a bare string has to use
function notation. I'm still not entirely sure we should even have a
string notation. Its utility/clutter ratio is pretty low.
: If I declare my rule as:
:
: rule MyRule (Str $text) { ... }
:
: Will the P6RE be smart enough to pass:
:
: m:{<MyRule /thisdir/>}
:
: As a string, not the rule that it looks like?
Probably not, given the late-binding issues and the desire to treat
patterns as first-class language. That's why we're differentiating
the call syntax without reference to the called signature. If you
want delayed compilation of the pattern, you can get it by passing
a string. But then any parentheses in it don't count as parens in
the outer rule. With <before ([A-Z]+)> we can treat the lookahead
parens as an ordinary capture because the parens are parsed at the
same time the outer rule is parsed.
: If I define a named rule, how do I get a reference to it from outside a
: rule?
: A05 leads me to think that
:
: my $rx := Rule.MyRule;
:
: will work, assuming Rule is the default grammar. I would also suspect
: that one could reverse this.
:
: my $rx = rule { ... };
: Rule.MyRule := $rx;
:
: Thus defining a new rule dynamically.
You have to use &Rule::MyRule to refer to the method by name. Rule.MyRule
would invoke the MyRule method as a class method in the Rule class, since
method calls do not require parens if there are no arguments.
Larry
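The point about delayed compilation and captures can be approximated in Python, again only as an analogy. When the sub-pattern is written inline, its parentheses are compiled with the outer pattern and count as one of its groups. When the sub-pattern is passed as a string and compiled separately, the outer pattern never sees those parentheses. The helper `before` is hypothetical and merely imitates a lookahead subrule.

```python
import re

# Sub-pattern parsed together with the outer pattern: its parentheses
# become a capture group of the outer match.
inline = re.compile(r"(\d+)(?=([A-Z]+))")
m = inline.search("42ABC")
print(m.groups())

def before(pattern_text):
    """Hypothetical lookahead subrule that compiles its string argument late."""
    sub = re.compile(pattern_text)
    def check(text, pos):
        # Succeeds if the sub-pattern matches at pos; consumes nothing.
        return sub.match(text, pos) is not None
    return check

# Sub-pattern passed as a string: the outer pattern has only its own group.
outer = re.compile(r"(\d+)")
m2 = outer.search("42ABC")
ok = before(r"([A-Z]+)")(m2.string, m2.end())
print(m2.groups(), ok)
```

In the second form the letters are checked but not captured by the outer match, which mirrors the remark that parentheses inside a string argument do not count as parens in the outer rule.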
Well, that would be written:
    rule baltag (Str $tag, Str $body is rw) {
        \< $tag .*? \>
        $body := (.*?)
        \</ $tag \>
    }
since $body is a real variable rather than a hash entry. However,
I don't think binding works even in an ordinary sub--rebinding the
variable would break its association with whatever actual parameter.
You'd need to use assignment:
    rule baltag (Str $tag, Str $body is rw) {
        \< $tag .*? \>
        (.*?)
        \</ $tag \>
        { let $body = $1 }
    }
: $buffer ~~ / <baltag title $<body>> .* \> <$body> \< /;
That'd have to be something like:
$buffer ~~ / <baltag(/title/,$<body>)> .* \> <$body> \< /;
But in theory it should autovivify the $<body> for you based on the "is rw".
: In other words, I want to pass a possibly unbound hypothetical into a
: subrule, and let the subrule bind/rebind it.
Can only assign it. You'd have to pass the actual symbol table holding
$<body> if you want to rebind the name. You could presumably pass
in $/ as an argument and hash it explicitly.
: An alternative would be a syntax for calling the subrule, and then
: binding a hypothetical to one of the hypotheticals returned in the
: subrule. I'm moderately sure S05 made this possible, but I couldn't put
: all the pieces together. I'll hazard the following guess:
:
: rule baltag (Str $tag) {
: \< $tag .*? \>
: $<body> := (.*?)
: \</ $tag \>
: }
:
: $buffer ~~ / $<btag> := <baltag title> <($<body> := $btag<body>)> /;
That's essentially correct, though that should probably be {...}
instead of <(...)>, in case the body has the value "0" which would
cause the <(...)> assertion to fail. There might also be some
circumstances in which you need a "let" in there if there's any
possibility of backtracking over the binding but still having access
to the match object.
: Which seems very clumsy, especially if I wish to call it a lot. Be nicer
: if I could push the work onto the subrule.
Well, if you never want <baltag> to return the tags, then you can just
say something like:
    rule baltag (Str $tag) {
        \< $tag .*? \>
        $0 := (.*?)
        \</ $tag \>
    }

    $buffer ~~ / $<body> := <baltag title> /;
But you probably want to know what attributes that first .*? matched too.
Larry
I'm inclined to say this from a PGE implementation perspective, at
least for the short-term.
As for the rest, I agree with Larry. :-)
Pm