>>On Thu, Apr 03, 2003 at 07:29:37AM -0800, Austin Hastings wrote:
>>
>>
>>>This has been alluded to before.
>>>
>>>What would /A*B*/ produce?
>>>
>>>Because if you were just processing the rex, I think you'd have to
>>>finish generating all possibilities of A* before you began iterating
>>>over B*...
>>>
>>>
>>The "proper" way would be to first produce all possibilities of length n
>>before giving any possibility of length n+1.
>>
>>''
>>'A'
>>'B'
>>'AA'
>>'AB'
>>'BB'
>>'AAA'
>>'AAB'
>>...
>>
>>I haven't spent a millisecond of working out whether that's feasible to
>>implement, but from a theoretical POV it seems like the solution.
>>
>>
>
>Well, I'm not certain there is really a "proper" way. But sure, your
>way is doable.
>
> use Permutations <<permutations compositions>>;
>
> # Generate all strings of length $n
> method Rule::Group::generate(Int $n) { # Type sprinkles :)
> compositions($n, +@.atoms) ==> map {
> my @rets = map {
> $^atom.generate($^n)
> } zip(@.atoms, $_);
> *permutations(*@rets)
> }
> }
>
>How's that for A4 and A6 in a nutshell, implementing an A5 concept? :)
>I hope I got it right....
>
>Provided each other kind of rx element implemented generate, that
>returned all generated strings of length $n, which might be zero.
>This would be trivial for most other atoms and ops (I think).
>
>Oh, compositions($a,$b) is a function that returns all lists of length
>$b whose elements sum to $a. Yes, it exists.
>
>I have a couple syntax questions about this if anyone knows the answers:
>
> $^atom.generate($^n)
>
>I want @rets to be an array of array refs. Do I have to explicitly
>take the reference of that, or does that work by itself?
>
> zip(@.atoms, $_)
>
>I want the array ref in $_ to be zipped up with @.atoms as if $_ were
>a real array. If this I<is> correct, am I allowed to say:
>
> zip(@.atoms, @$_)
>
>for documentation?
>
>Also, related to the first question:
>
> *permutations(*@rets)
>
>Does that interpolate the returned list from permutations right into
>the map's return, a la Perl5? Do I need the * ?
>
As for all of these questions, I think the answers are related. I think
the general question is "Is implicit flattening needed for perl6
builtins?" I think that the answer is no, or at least should be no,
because it won't be hard to get the builtins to DWIM because of
multimethods.
For instance, an implementation of map might be:
sub *map (&code, Array @array) {
return @array.map(&code);
}
sub *map (&code, *@rest) {
my @ret;
for @rest {
@ret.push( &code.($_) );
}
return @ret;
}
So, given an Array/Array Subclass/Reference to one of the two as the
2nd argument to map, map would call the method version of map;
otherwise, the arguments after the code block are flattened and
looped over.
This behavior should be consistent across all of the perl6 builtins.
Joseph F. Ryan
ryan...@osu.edu
Except that it should be:
multi *map (&code, Array @array) {
return @array.map(&code);
}
multi *map (&code, *@rest) {
my @ret;
for @rest {
@ret.push( &code.($_) );
}
return @ret;
}
I swear, my brain must hate me; I always overlook the most obvious
mistakes. (-:
Joseph F. Ryan
ryan...@osu.edu
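A minimal sketch of the two-variant dispatch described above, written in Python only because it can be run today (the Perl 6 syntax in this thread was still speculative). The name map_dwim is hypothetical, and a runtime type check stands in for real multimethod dispatch:

```python
def map_dwim(code, *rest):
    """Apply `code` to an array argument, or to a flattened argument list."""
    # Variant 1: a single array-like argument is treated as the array itself.
    if len(rest) == 1 and isinstance(rest[0], (list, tuple)):
        return [code(item) for item in rest[0]]
    # Variant 2: everything after the code block is flattened one level
    # and looped over.
    flat = []
    for item in rest:
        if isinstance(item, (list, tuple)):
            flat.extend(item)
        else:
            flat.append(item)
    return [code(item) for item in flat]
```

Either call style gives the same result, which is the DWIM behavior being argued for: map_dwim(f, [1, 2, 3]) and map_dwim(f, 1, 2, 3) both visit 1, 2, 3.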
>>making *productions* of strings/sounds/whatever that could possibly
>>match the regular expression?
>>
>>
>>>Correct me if I am wrong, but isn't this the :any switch of apoc 5?
>>>http://www.perl.com/pub/a/2002/06/26/synopsis5.html
>>>
>
>Not really, unless the input string is infinite!
>
Well, that's just in the general purpose case, right? That's because
a regex like /a*/ matches:
'w'
'qsdf'
'i bet you didn't wnt this to mtch'
So, you're going to need some sort of controlled input to a regex match
with the :any for it to work right.
Here's my approach to the problem: generate a possible string that
could match every atom in the regex individually, and then generate
matches for the whole regex off of that. I liked Luke's approach
of stapling methods onto the Rx classes, so I used an approach that
made use of that idea. I completed each of the needed rules, since
the methods in my example are pretty simple (they probably would be
in Luke's example too, but I just wanted to be sure I wasn't missing
anything).
use List::Permutations <<permutations>>; # Perl 5's name.
sub generate (rx $spec, Int $limit) {
my $string = $spec.generate_match (&propagate, $limit);
$string =~ m:any/ (<$spec>) { yield $1 } /;
my sub propagate ($atom) {
given ($atom) {
when Perl::sv_literal {
$string ~= $_.literal()
}
when Perl::Rx {
$string ~= .generate_match (&propagate, $limit)
if .isa(generate_match)
}
}
}
}
Perl::Rx::Atom::generate_match (&p, $limit) {
return &p.($.atom)
}
Perl::Rx::Zerowidth::generate_match (&p, $limit) {
return &p.($.atom)
}
Perl::Rx::Meta::generate_match (&p, $limit) {
return join '', $.possible
}
Perl::Rx::Oneof::generate_match (&p, $limit) {
return join '', $.possible
}
Perl::Rx::Charclass::generate_match (&p, $limit) {
return join '', $.possible
}
Perl::Rx::Sequence::generate_match (&p, $limit) {
my $string;
$string ~= &p.($_) for $.atoms;
return $string;
}
Perl::Rx::Alternation::generate_match (&p, $limit) {
my $string;
$string ~= &p.($_) for $.branches;
return $string;
}
Perl::Rx::Modifier::generate_match (&p, $limit) {
my $string;
$string ~= &p.($_) for $.atoms;
# is $self ($.) still the topic here? or is the last
# member of $.atoms?
return $self.mod.transform($string);
}
Perl::Rx::Modifier::repeat (&p, $limit) {
$string := join '', map { join '', $_ }
permutations (split //, &p.($.atom)) xx ($.max // $limit);
return $string;
}
So, given a call like:
generate (/(A*B*(C*|Z+))/, 4);
The C<$string> variable in the 2nd line of C<generate> would become:
AAAABBBBCCCCZZZZ
And the :any switch takes care of the rest. (-:
Joseph F. Ryan
ryan...@osu.edu
Sorry, it was proposed to be like that:
sub peek_at_sky {
my Color @numbers = peek_with_some_hardware;
my Str @words = map { "1" but color($_) } @numbers;
my $say_it is from( @words ) ;
return $say_it ;
}
rule color { (.) { let $0 := $1.color } }
$daylight = &peek_at_sky =~ /<color>+/; # is something in sky
The "proper" way would be to first produce all possibilities of length n
before giving any possibility of length n+1.
''
'A'
'B'
'AA'
'AB'
'BB'
'AAA'
'AAB'
...
I haven't spent a millisecond of working out whether that's feasible to
implement, but from a theoretical POV it seems like the solution.
--
Matthijs van Duin -- May the Forth be with you!
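A rough sketch of that length-first ordering for /A*B*/, in Python rather than Perl 6. It assumes every atom is a single literal character under a star quantifier; compositions, star_sequence and by_length are hypothetical names, not an existing module:

```python
from itertools import count, islice

def compositions(total, parts):
    """Yield every tuple of `parts` non-negative ints that sums to `total`."""
    if parts == 0:
        if total == 0:
            yield ()
        return
    # Give the earliest atom as many repetitions as possible first.
    for first in range(total, -1, -1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def star_sequence(atoms, n):
    """Yield all strings of length n matching atom1* atom2* ... in order."""
    for comp in compositions(n, len(atoms)):
        yield "".join(atom * reps for atom, reps in zip(atoms, comp))

def by_length(atoms):
    """Yield every production, all of length n before any of length n+1."""
    for n in count():
        yield from star_sequence(atoms, n)

first_eight = list(islice(by_length("AB"), 8))
```

Here first_eight holds the same eight strings as the listing above, in the same order. Because by_length is a generator, the infinite language is only ever produced on demand.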
But that's the point - I don't want it to be just able to generate all possibilities,
I want it to be able to generate a subset of valid possibilities. And have:
a) a default heuristic for doing so, based on a regex
b) user defined heuristics for doing so
Although I disagree with you on the idea that it has no uses as is - generating all
possible combinations. You could do:
my @list is Regex::Generator(/([1-6])([1-6^\1])([1-6^\1\2])/)
to return a list of all combinations of numbers between 1 and 6 and:
my @words = qw( word list number one );
my @words2 = qw( word list number two );
my @list is Regex::Generator(/ (@words) (@words2) /);
to generate all possible combinations of words. You could also test hard to understand
rexen by simplifying and generating all possible combinations:
my $_doublestring = q$(?:\"(?>[^\\\"]+|\\\.)*\")$;
becomes
my $_doublestring = q$(?:\"(?>[notdq]+|\\\")*\")$;
to generate:
""
"n"
"o"
"t"
...
"\""
>
> But I guess then you'd see a lot more quantifiers and such.
>
> /\w+<8>: \d<4>/
or substituting \w for something more manageable like [a-f] and \d for [1-2].
> Is finite (albeit there are 63**8 * 10**4 == 2,481,557,802,675,210,000
> combinations). References to the heat death of the universe, anyone?
>
> And then there's Unicode. %-/
> In reality, I don't think it would be that useful. Theoretically,
> though, you *can* look inside the regex parse tree and create a
> generator out of it... so, some module, somewhere.
Of course, it would need a little elbow grease to be truly useful. The syntax for
making heuristics in generating useful productions would take some work. But I can think
of a dozen uses for it.
Ex: Right now, I'm writing a generator to generate sample programming problems - for a
book I'm writing. It spits out both the problem, and the code to answer the problem.
Using a production engine like the one above, this problem generator becomes 20
lines of code.
Ed
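For the two finite examples above, the expected productions can be sketched directly in Python. This is not a regex-driven generator: it hand-translates the two patterns, and distinct_digit_strings and word_pairs are hypothetical names:

```python
from itertools import permutations, product

def distinct_digit_strings(digits="123456", length=3):
    """Strings of `length` digits drawn from `digits` with no digit repeated,
    which is what the backreference-excluding character classes ask for."""
    return ["".join(p) for p in permutations(digits, length)]

def word_pairs(words, words2):
    """Every (first, second) pairing of one word from each list."""
    return list(product(words, words2))

triples = distinct_digit_strings()
pairs = word_pairs(["word", "list", "number", "one"],
                   ["word", "list", "number", "two"])
```

Under these assumptions the first pattern yields 6 * 5 * 4 = 120 strings and the second yields 4 * 4 = 16 pairs, small enough to enumerate exhaustively.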
>>What I think you're looking for is the fact that they're not regexes any more. They are
>>"rexen", but in horrifying-secret-reality, what has happened is that Larry's decided
>>to move Fortran out of core, and replace it with yacc.
>>
>>
>
>just an aside, and a bit off-topic, but has anybody considered hijacking the regular
>expression engine in perl6 and turning it into its opposite, namely making *productions*
>of strings/sounds/whatever that could possibly match the regular expression? ie:
>
>a*
>
>producing
>
>''
>a
>aa
>aaa
>aaaa
>
>etc.
>
Correct me if I am wrong, but isn't this the :any switch of apoc 5?
http://www.perl.com/pub/a/2002/06/26/synopsis5.html
Joseph F. Ryan
ryan...@osu.edu
Yeah, it seems like a neat idea. It is if you generate it
right... but, fact is, you probably won't. For anything that's more
complex than your /a*/ example, it breaks down (well, mostly):
/\w+: \d+/
Would most likely generate:
a: 0
a: 00
a: 000
a: 0000
Or:
a: 0
a: 1
...
a: 9
a: 00
a: 01
ad infinitum, never getting to even aa: .*
But I guess then you'd see a lot more quantifiers and such.
/\w+<8>: \d<4>/
Is finite (albeit there are 63**8 * 10**4 == 2,481,557,802,675,210,000
combinations). References to the heat death of the universe, anyone?
And then there's Unicode. %-/
In reality, I don't think it would be that useful. Theoretically,
though, you *can* look inside the regex parse tree and create a
generator out of it... so, some module, somewhere.
> Now, just got to think of the syntax for it.. how to make it usable.
That's easy:
use Regex::Generator;
my @list is Regex::Generator(/a*/);
for @list {
dostuff($_)
}
That or an iterator.
Luke
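One way around the "never getting to aa:" problem is to enumerate by total length instead of exhausting one quantifier first. A minimal Python sketch for the pattern \w+: \d+ over deliberately tiny alphabets (the alphabets and function names are assumptions made for illustration):

```python
from itertools import count, islice, product

def strings_of_length(alphabet, n):
    """Yield every string of exactly n characters over `alphabet`."""
    for chars in product(alphabet, repeat=n):
        yield "".join(chars)

def word_colon_digits(letters, digits):
    # Walk the combined length of the two repeated parts upward, so every
    # split between "word" and "digits" is reached after finitely many steps.
    for total in count(2):
        for word_len in range(1, total):
            digit_len = total - word_len
            for word in strings_of_length(letters, word_len):
                for number in strings_of_length(digits, digit_len):
                    yield f"{word}: {number}"

first_twenty = list(islice(word_colon_digits("ab", "01"), 20))

# Size of the bounded case quoted above: 8 word characters, 4 digits.
full_space = 63**8 * 10**4
```

Being a generator, this doubles as the iterator interface suggested above: the caller pulls as many productions as it wants and stops.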
Larry
> Yary Hluchan wrote:
>
>>>making *productions* of strings/sounds/whatever that could possibly
>>>match the regular expression?
>>>
>>>
>>>>Correct me if I am wrong, but isn't this the :any switch of apoc 5?
>>>>http://www.perl.com/pub/a/2002/06/26/synopsis5.html
>>>>
>>
>>Not really, unless the input string is infinite!
>>
>
>
> Well, that's just in the general purpose case, right? That's because
> a regex like /a*/ matches:
>
> 'w'
> 'qsdf'
> 'i bet you didn't wnt this to mtch'
>
> So, you're going to need some sort of controlled input to a regex match
> with the :any for it to work right.
>
> Here's my approach to the problem: generate a possible string that
> could match every atom in the regex individually, and then generate
> matches for the whole regex off of that. I liked Luke's approach
> of stapling methods onto the Rx classes, so I used an approach that
> made use of that idea. I completed each of the needed rules, since
> the methods in my example are pretty simple (they probably would be
> in Luke's example too, but I just wanted to be sure I wasn't missing
> anything).
[...]
>
> So, given a call like:
>
> generate (/(A*B*(C*|Z+))/, 4);
> The C<$string> variable in the 2nd line of C<generate> would become:
>
> AAAABBBBCCCCZZZZ
>
> And the :any switch takes care of the rest. (-:
Sadly, no it doesn't, the rule given should match 'AZZZ', but that's
not a substring of the string you generated.
--
Piers
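The counterexample is easy to check mechanically. A small Python sketch, using Python's re module as a stand-in for the Perl 6 rule (the helper names are hypothetical):

```python
import re

RULE = re.compile(r"A*B*(C*|Z+)")
GENERATED = "AAAABBBBCCCCZZZZ"

def matches_rule(candidate):
    """True if the whole candidate string is matched by the rule."""
    return RULE.fullmatch(candidate) is not None

def substrings(text):
    """Every contiguous substring of text, including the empty string."""
    return {text[i:j] for i in range(len(text)) for j in range(i, len(text) + 1)}

found = substrings(GENERATED)
missed = matches_rule("AZZZ") and "AZZZ" not in found
```

Here missed is true: 'AZZZ' satisfies the rule but can never be yielded by scanning the generated string, because the B and C runs sit between the A's and the Z's.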
You're right. I guess the only true way to do it would be to perform the
process for each atom, and then concat the results. Unfortunately, this
does bloat my solution:
use List::Permutations <<permutations>>; # Perl 5's name.
sub generate (rx $spec, Int $limit) {
my @atoms = get_tokens();
while (1) {
my $string;
for @atoms {
$string ~= $_();
}
yield $string;
}
my sub get_tokens() {
my @atoms;
given ($atom) {
when Perl::Rx::Sequence
| Perl::Rx::Alternation
| Perl::Rx::Modifier
{
@atoms = .atoms;
}
default {
@atoms = (.atom);
}
}
my @ret;
for @atoms {
my $x = .generate_match (&propagate, $limit);
@ret.push ( {$x =~ m:any/(<$_>){ yield $1 }/} )
}
}
my sub propagate ($atom) {
given ($atom) {
when Perl::sv_literal {
$string ~= .literal
}
when Perl::Rx {
$string ~= .generate_match (&propagate, $limit)
if .isa(generate_match)
}
}
}
}
Perl::Rx::Atom::generate_match (&p, $limit) {
return &p.($.atom)
}
Perl::Rx::Zerowidth::generate_match (&p, $limit) {
return &p.($.atom)
}
Perl::Rx::Meta::generate_match (&p, $limit) {
return join '', $.possible
}
Perl::Rx::Oneof::generate_match (&p, $limit) {
return join '', $.possible
}
Perl::Rx::Charclass::generate_match (&p, $limit) {
return join '', $.possible
}
Perl::Rx::Sequence::generate_match (&p, $limit) {
my $string;
$string ~= &p.($_) for $.atoms;
return $string;
}
Perl::Rx::Alternation::generate_match (&p, $limit) {
my $string;
$string ~= &p.($_) for $.atoms;
return $string;
}
Perl::Rx::Modifier::generate_match (&p, $limit) {
my $string;
$string ~= &p.($_) for $.atoms;
# is $self ($.) still the topic here? or is the last
# member of $.atoms?
return $self.mod.transform($string);
}
Perl::Rx::Modifier::repeat (&p, $limit) {
$string := join '', map { join '', $_ }
permutations (split //, &p.($.atom)) xx ($.max // $limit);
return $string;
}
Joseph F. Ryan
ryan...@osu.edu
Ah modesty. I'll re-read RFC93/A5 and summarize my understanding:
> 093 abb Regex: Support for incremental pattern matching
Problem=binding a pattern to a file or stream. Problem accepted.
Looks like Larry says, not just stream, but any array, possibly an
infinite one at that, and discusses stringifiable arrays. Which leads
to positing an ArrayString tie, which leads to patterns matching
against arbitrary stringifiable arrays- which in turn can represent
open filehandles or sockets.
So back to square one. A pattern has to act on a string at some level.
A5 shows a cool way to stringify an array. Arcadi showed a cool way to
use properties on a dummy string, so a pattern can use assertions on
object-properties of the characters of that dummy string. Combining
that trick with a variation on the tie would make what I envision
transparent. Still, when looking for a pattern among a million sound
frames, I don't want to diddle a property on a char 1E6 times over.
Can't say @sounds =~ /pattern/, because that will stringify each
element of @sounds, finding which stringifications match. Can't use
hashes, they're taken too.
How about binding a sub to the pattern? A sub can feed arbitrary objects
to the pattern. The pattern can eat them with custom assertions, and
maybe also the dot, since the dot matches one of anything.
This suggestion looks like RFC 93, but it isn't, except for the
special case where the sub returns characters. Sorry I mentioned the
RFC in the first place- taking that out of the picture- what do you
think?
-y
~~~~~
The Moon is Waxing Crescent (38% of Full)
You're confusing
@sounds ~~ /foo/
With
$foo ~~ /@sounds/
Which are quite different. The behavior of the former that you
described would be written:
any(@sounds) ~~ /foo/
And the real semantics would (theoretically) be the stringified array
behavior of RFC 93 in A5.
> How about binding a sub to the pattern? A sub can feed arbitrary objects
> to the pattern. The pattern can eat them with custom assertions, and
> maybe also the dot, since the dot matches one of anything.
Sure, that's no problem. It's just a bad solution compared to the
array, because the sub has to maintain state. But it is doable with
(something like):
class MySubArray is Array {
has &.code;
method FETCH($index) {
return .code();
}
}
But you'd probably get more use out of a coroutine. And that you
could wrap up in an array for the regex engine like so:
sub somecoroutine() {...}
my @things := <somecoroutine()>;
@things ~~ /.../
Laziness makes it all work out.
Perl 6 has the ability, with ties, to lie about what's really driving
things. And since binding is default for sub calls, and because tying
will presumably be so much easier, I presume that this kind of act
will be much more common than in Perl 5. (But a future version of
Perl6::Classes will make it just as easy in Perl 5 :)
Lazy arrays fit the regex bill nicely, because it only has to
stringify things once, even when it's backtracking over array element
boundaries.
Luke
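A rough Python sketch of the lazy-array idea: a wrapper that pulls from a coroutine-like source only when an index is first requested and caches the result, so a matcher that backtracks re-reads the cache instead of re-running the source. LazyArray is a hypothetical class, not the Perl 6 tie mechanism:

```python
class LazyArray:
    """Array-like view over an iterator; elements are fetched once, on demand."""

    def __init__(self, source):
        self._source = iter(source)
        self._cache = []

    def __getitem__(self, index):
        if index < 0:
            raise IndexError("negative indexes need the full length")
        # Pull just enough elements from the source to cover `index`.
        while len(self._cache) <= index:
            try:
                self._cache.append(next(self._source))
            except StopIteration:
                raise IndexError(index) from None
        return self._cache[index]

    def fetched(self):
        """How many elements have been pulled from the source so far."""
        return len(self._cache)
```

A Python generator plays the role of the coroutine here: LazyArray(some_generator()) can be indexed forwards and backwards while the generator itself only ever moves forward.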
Yes, arrays are the way to go. I had a change of heart overnight-
binding against a sub directly is ugly (at least for any purpose I
can think of).
I still don't have a syntax for what I want.
Here's "current" perl6, for finding if there are two adjacent color objs:
use Colorific qw(Blue BlazingWhite);
rule color { (.) { let $0 := $1.color } }
rule same_color($color is Colorific)
{
<color> ::: { fail unless $1.looks_like($color); }
}
my Colorific @skysamples = peek_with_some_hardware;
my $sky_string = join map { "1" but color($_) } @skysamples ;
$clear_day = $sky_string =~ /<same_color(Blue)><same_color(BlazingWhite)>/;
What I want is something like:
use Colorific qw(Blue BlazingWhite);
rule same_color($color is Colorific)
{
(.) { fail unless $1.looks_like($color); }
}
my Colorific @skysamples = peek_with_some_hardware;
my $clear_day = @skysamples =~
m:sequential_objects/<same_color(Blue)><same_color(BlazingWhite)>/;
In this hypothetical example, there's a "sequential_objects" modifier
that treats "@skysamples" as a sequential list of objects, instead of
a list of independent strings to grep through. It also tells the "."
not to stringify its input. Then the "color" rule isn't needed to unpack
the property of the dummy char anymore. And, the whole thing is cleaner!
Is a modifier the way to express this? Or an adverb? Something else?
Not tying to a string, since that requires stringifying the list. I don't
want to deal with characters.
What if I want to check which days were sunny? Can this work?
# already read list of Colorific objects for each day's sky
@WorkWeek=[@MonSky, @TuesSky, @WedsSky, @ThursSky, @FriSky];
@ClearDays = @WorkWeek =~
m:sequential_objects/<same_color(Blue)><same_color(BlazingWhite)>/;
should be the same as
@ClearDays = @WorkWeek.grep(
m:sequential_objects/<same_color(Blue)><same_color(BlazingWhite)>/);
Thanks for bearing with me. Learning how to be clear and also learning
more perl...
-y
~~~~~
The Moon is Waxing Crescent (47% of Full)
I think this problem is isomorphic with the "generic equality"
problem. There's no "generic match" really. You have to decide what
things you want to match. Are you matching class names? Are you
matching specific object attributes? Are you matching object value?
Are you matching string representations? Et cetera.
So, you need at least two things: a way to specify your atoms are
objects and a way to say what about those objects you are matching. You
have the first of these quite well with your :sequential_objects
modifier (though I'd probably call it :uo or maybe :obj), but your
example is lacking good support for the second I think. It seems like
you'd be doing a lot of dot matching in order to get the object to
select the matching criteria.
What *I'd* like to write instead of your example above, is something
like this:
use Colorific qw(Blue BlazingWhite);
my Colorific @skysamples = peek_with_some_hardware;
my $clear_day = @skysamples ~~
m:obj/<.looks_like(Blue)><.looks_like(BlazingWhite)>/;
or maybe this if looks_like could be an is-a relationship:
my $clear_day = @skysamples ~~ m:obj/<Blue><BlazingWhite>/;
or maybe this:
my $clear_day = @skysamples ~~ m:obj(looks_like)/<Blue><BlazingWhite>/;
Where once the parser knows it's dealing with "objects", it can
Huffmanly decide what "match" means or be explicitly told what "match"
means and each atom becomes the topic for the current assertion.
I like the first of my examples above because then it's easy to vary
the idea of matchiness on a per-assertion basis. Zero or more samples
that looks_like Blue, followed by a sample that contains a cloud could
be written something like this:
my $clear_day = @skysamples ~~
m:obj/<.looks_like(Blue)>* <.contains(Cloud)>/;
caveat lector, I'm just muddling through :-)
-Scott
--
Jonathan Scott Duff
du...@cbi.tamucc.edu
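To make the per-assertion idea concrete, here is a small Python sketch of a backtracking matcher over a list of objects, where each pattern element is a predicate plus an optional star. It only illustrates the semantics being discussed; match_here and search are hypothetical names and nothing here reflects how a Perl 6 engine would do it:

```python
def match_here(pattern, items, pos):
    """Return the end index of a match starting at pos, or None."""
    if not pattern:
        return pos
    (predicate, starred), rest = pattern[0], pattern[1:]
    if starred:
        # Greedy: take as many matching items as possible, then back off.
        end = pos
        while end < len(items) and predicate(items[end]):
            end += 1
        for stop in range(end, pos - 1, -1):
            result = match_here(rest, items, stop)
            if result is not None:
                return result
        return None
    if pos < len(items) and predicate(items[pos]):
        return match_here(rest, items, pos + 1)
    return None

def search(pattern, items):
    """Return (start, end) of the first match anywhere in items, or None."""
    for start in range(len(items) + 1):
        end = match_here(pattern, items, start)
        if end is not None:
            return (start, end)
    return None
```

The last example above, zero or more Blue samples followed by one containing a cloud, becomes a two-element pattern: [(looks_blue, True), (has_cloud, False)], with the two predicates supplied by the caller.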
So, maybe I'm off-base, and please feel free to ignore me....
But I think I'd rather see something like this:
use Colorific qw(Blue BlazingWhite);
my Colorific @skysamples = peek_with_some_hardware;
my $clear_day ||= ($_ ~~ /<Blue><BlazingWhite>/) for @skysamples;
The whole idea of smart matching on sets hasn't quite settled into a
comfy spot in my brain yet.
Except that would do something different. The original code was
meant to match an array element with $obj.looks_like(Blue) and then the
immediate successor to $obj.looks_like(BlazingWhite). Yours attempts to
match those two things to a single element of the array.
> The whole idea of smart matching on sets hasn't quite settled into a
> comfy spot in my brain yet.
It will probably feel that way until there's a completely working perl
6 out there to try your ideas upon. :-)
This ties into something I was thinking while pondering the regex
engine. Indeed we have "parse objects" (though we don't quite know
how they behave), and I think we also need "parser objects." And I
think the regex core needs to support them.
As I see it, Perl 6 will prove the most powerful parsing tool
since... well... never. And Perl 6 has a very nice syntax for
grammars, and patterns in general.
But, parsing problems have trade-offs for expressiveness, power, and
speed. And a recursive descent is one of the more powerful, but also
quite slow. And, there is no catch-all best algorithm for parsing.
So, instead of forcing a recursive descent on people, they should be
allowed to choose which of several algorithms they want. And it should
be easy to write parser back-ends to the regex syntax. Easier than
walking the parse tree (even though we don't yet know how easy that
would be). And the default would probably stay recursive-descent for
its power.
So, if one was doing complex parsing, one could C<use Parser::Earley>
(naturally written by me this time around, too :). If one was doing
parsing on large grammars with tokenizable input, one could C<use
Parser::LALR>. And then Perl 6's regex engine is still the fastest on
the market, because the programmer can choose which variant is the
fastest for his/her specific application.
So, how does this relate to the discussion at hand? Well, maybe
parsers are implemented policy-wise. Then you could replace the
default character atom with a Colorific atom.
my $colorific_parser = new Parser::RecursiveDescent(
atom => Colorific );
$colorific_parser.parse(@skysamples, /./); # Matches one Colorific
The method by which it is specified probably has to be more powerful than
that.
The motivation for this kind of generalization is that pattern
matching applies to a lot more than matching sequences of characters.
C<Parser::LALR> needs tokens, possibly tokenized from
C<Parser::DFA>. Or in the more abstract sense, patterns can be applied
to multimethod matching, as a custom dispatcher. And then classes
need be our atoms.
I hope this whet somebody's appetite for abstract thinking. Patterns
are just too powerful for text.
Luke
>What *I'd* like to write instead of your example above, is something
>like this:
>
> use Colorific qw(Blue BlazingWhite);
>
> my Colorific @skysamples = peek_with_some_hardware;
>
> my $clear_day = @skysamples ~~
> m:obj/<.looks_like(Blue)><.looks_like(BlazingWhite)>/;
>
>or maybe this if looks_like could be an is-a relationship:
>
> my $clear_day = @skysamples ~~ m:obj/<Blue><BlazingWhite>/;
>
>or maybe this:
>
> my $clear_day = @skysamples ~~ m:obj(looks_like)/<Blue><BlazingWhite>/;
>
>Where once the parser knows it's dealing with "objects", it can
>Huffmanly decide what "match" means or be explicitly told what "match"
>means and each atom becomes the topic for the current assertion.
>
>I like the first of my examples above because then it's easy to vary
>the idea of matchiness on a per-assertion basis. Zero or more samples
>that looks_like Blue, followed by a sample that contains a cloud could
>be written something like this:
>
> my $clear_day = @skysamples ~~
> m:obj/<.looks_like(Blue)>* <.contains(Cloud)>/;
>
>caveat lector, I'm just muddling through :-)
>
I may just be leaping off a limb here, but...
Suppose that all regex/rule objects have a C<pattern> method that is
used during compilation to access the regex/rule. Normally, this
method would be called transparently by the =~ and ~~ whenever they
needed to perform a match. This behavior wouldn't change anything
about how regex/rules would work. However, this would allow classes
to define their own C<pattern> method; kind of like defining their
own stringification method. This would allow us to do:
class Colorific {
my %colors = (
blue => '1',
cloud => '2',
flame => '3',
...
)
method pattern {
my $string;
$string ~= type($_) foreach @.colors;
return $string;
}
sub type ($c) {
return %colors{$c}
}
sub Blue is exported_ok { # or however exporter will work
return %colors{blue}
}
sub Cloud is exported_ok {
return %colors{cloud}
}
sub Flame is exported_ok {
return %colors{flame}
}
...
}
And then you could do:
use Colorific qw(Blue Cloud);
my Colorific @skysamples = peek_with_some_hardware;
my $clear_day = @skysamples ~~ /<Blue>* <Cloud>/;
Which ain't too bad, in my book. (-:
Joseph F. Ryan
ryan...@osu.edu
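The same trick can be sketched in Python: map each sample to a one-character code, join the codes, and run an ordinary text regex over the result. The code table mirrors the one above; encode and blue_then_cloud are hypothetical names, and samples are plain color-name strings here rather than Colorific objects:

```python
import re

# One character per kind of sample, as in the %colors table above.
CODES = {"blue": "1", "cloud": "2", "flame": "3"}

def encode(samples):
    """Turn a list of color names into the string the regex will see."""
    return "".join(CODES[name] for name in samples)

def blue_then_cloud(samples):
    """Text-regex equivalent of /<Blue>* <Cloud>/ over the encoded samples."""
    return re.search(CODES["blue"] + "*" + CODES["cloud"], encode(samples)) is not None
```

The limitation is visible in the sketch: every criterion has to be baked into the encoding before matching starts, so the pattern cannot ask a different question of each element.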
Sure--that's part of why we're treating Grammars as Classes that
(to some extent) can be compiled before use. Though perhaps it's a
fundamental mistake to let the class choose the exact type over which
it operates, since that precludes generic code.
: I hope this whet somebody's appetite for abstract thinking. Patterns
: are just too powerful for text.
Well, that depends on how one defines "text", but by your definition,
I agree. Though the danger of treating patterns as too powerful for just
text is that we run the risk of making patterns too powerful for text.
However lofty we aspire to get, we gotta keep our feet on the ground.
Larry
> However lofty we aspire to get, we gotta keep our feet on the ground.
This thread could be moot, seeing how shape-shifting perl6 is shaping
up to be. F'rinstance- If the pattern-engine can be subclassed and
its internal ops overridden- bwaa-haaa-haa-ha! The dot is mine!
-y
~~~~~
The Moon is Waxing Crescent (49% of Full)
You are assuming that the tied string is a string of characters.
I don't see why it couldn't be a string of objects, with a string of
characters merely being a special case. So you could conceivably
intermix characters and objects in a string even.
And there's no reason for a full fledged tie. Merely saying
@array.seq ~~ /<foo><bar><baz>/
or
~@array ~~ /<foo><bar><baz>/
might be sufficient clue to the regex, and given that the "stringification"
can happen lazily, might never actually produce a real string.
On the other hand, it's possible that the smart matching table is
screwed up, and that it's a mistake to assume that
$foo ~~ (1,2,3)
means
$foo ~~ (1|2|3)
That can be construed as throwing away sequential information needlessly.
Likewise,
@array ~~ /<foo><bar><baz>/
should perhaps be written
any(@array) ~~ /<foo><bar><baz>/
if we mean to throw away the ordering of the array, and just
@array ~~ /<foo><bar><baz>/
if we want to match the elements sequentially, regardless of whether
they're matched as characters or as objects.
On the gripping hand, it's hard to know which view would have more
failure modes without trying it both ways.
In any event, it could depend on the type of either the array or the
regex, since as has been pointed out elsewhere, smart matching is like
the generic equality problem. But the thing that bothers me about
that is the notion of mixtures of text and objects. HTML and XML could
be viewed as objects embedded in text, and text embedded in objects.
Making the ~~ switch on argument types doesn't give fine enough control.
Putting the fine control inside the regex seems like it's too late,
if @array has already been interpreted as an any().
If we did break the auto-any of arrays, then for array intersection,
instead of this:
@foo ~~ @bar
we'd have to write this:
any(@foo) ~~ any(@bar)
I suppose that'd mean that
@foo ~~ @bar
would mean the two arrays are the same sequences of values. This may
increase or decrease the power of
given @array {...}
and
given %hash {...}
Have to think about that a lot more. It's easy enough to say
any(@foo) ~~ any(@bar)
if you want set intersection, but if you're given an array, writing
it as a case now becomes
given @foo {
...
when .any ~~ @bar.any {...}
}
Perhaps that's rare enough not to be a problem.
There are other ramifications. If we allow
given @foo {
when [1,2,3] {...}
}
to do list matching, then people are going to want wildcards:
given @foo {
when [] {...}
when [1] {...}
when [1,2] {...}
when [1,2,3] {...}
when [1,2,3,*] {...}
}
But that's obviously too simpleminded, when the list is functioning
as a pattern. So we need something like
given @foo {
when [1,2,3,<elem>+] {...}
}
And now, of course, you want to capture what the <elem>+ matched, so we have
given @foo {
when [1,2,3,(<elem>+)] { print @1 }
}
Urgh. It's not clear that we want to reinvent regex syntax for lists.
On the other hand, it's not clear that we don't....
It's possible that we just write that with elements that are regexes:
given @foo {
when [1,2,3,/<elem>+/] { print @1 }
}
Of course, given the symmetry of ~~, @foo could also contain elements
that are object regexes. So if there's a collision of regexes,
we have to figure out if they can match each other in some fashion?
Pardon me while my brain hurts...
Larry
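The two readings of array matching, plus the wildcard case, can be contrasted in a few lines of Python. This is only a sketch of the semantics under discussion, with hypothetical function names; it says nothing about how the smart-match table was eventually settled:

```python
def any_match(foo, bar):
    """any(@foo) ~~ any(@bar): true if the two lists share any element."""
    return any(x == y for x in foo for y in bar)

def seq_match(foo, bar):
    """The sequential reading: same values in the same order."""
    return list(foo) == list(bar)

def match_with_rest(items, prefix, min_rest=0):
    """Match a literal prefix and capture what follows, or return None.

    min_rest=0 behaves like a trailing wildcard; min_rest=1 behaves like
    a trailing one-or-more with its capture.
    """
    items, prefix = list(items), list(prefix)
    if items[:len(prefix)] != prefix:
        return None
    rest = items[len(prefix):]
    return rest if len(rest) >= min_rest else None
```

Ordering matters only in the second and third functions, which is exactly the information the auto-any reading throws away.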
Ah, ok. (I copied the /<Blue><BlazingWhite>/ from later in the message,
but probably just didn't read it closely enough.) :o/
> > The whole idea of smart matching on sets hasn't quite settled into
> > a comfy spot in my brain yet.
>
> It will probably feel that way until there's a completely working
> perl 6 out there to try your ideas upon. :-)
Very true.
I agree completely. But I think this is an area where hard things can
be possible.
The default rule engine should be optimized for homogeneous strings
composed of characters as that's the common case. But if we can tell
perl that our "string" is heterogeneous or that it's composed of object
"characters" then it's free to use a different rule engine that
doesn't make those assumptions and we've just made some nifty things
possible.
That is pretty nifty, but how would you say something like "match 2 blue
things, then a thing that isa Bird, then something that can fly"? Your
way puts all of the logic in the pattern method when it's probably
better in the rule itself, especially if the stream of objects we're
matching against is heterogeneous:
@things ~~ /<.color(Blue)><2><.isa(Bird)><.can(fly)>/;
versus
@things ~~ /<Blue><2><Bird><fly>/;
Imagine that @things contains 2 X objects and 2 Y objects that would
cause the match to succeed. The pattern method for the Y object would
have to somehow handle both the .isa and .can ideas. How would it
know which to apply when?
A "Colorific atom" doesn't make sense to me. It seems to conflate the
thing you want to match against with the criteria you've chosen for
matching. It's equivalent to saying
my $p = new Parser::RecursiveDescent( atom => "a" );
$p.parse(@things, /./); # matches one "a"
Atoms are things to which you can apply a matching criterion. "Is it
Colorific?" sounds like a matching criterion and "it" sounds like an
object, so your atoms would be objects, not Colorifics.
> I hope this whet somebody's appetite for abstract thinking. Patterns
> are just too powerful for text.
Nah, just first pass your object stream through a routine that maps
each object to a string representation of the thing you want to match
against on each object, then apply regular text rules to the string.

    Grammar WallStreet;

    rule day
    {
        /./   # Expects database stock-price data. See class Ticker.
    }

    rule trend(&xer_than)
    {
        <day> { $0.min = $1.min; $0.max = $1.max; }
        (
            <day> <( &xer_than($1.min, $0.min)
                     && &xer_than($1.max, $0.max) )>
            { $0.max = $1.max; }
        )+
    }

    my $up_trend   = rx/<trend(infix:>)/;
    my $down_trend = rx/<trend(infix:<)/;

    my $buy_signal  = rx/<$down_trend><$up_trend> <( $2.length == 1 )>/;
    my $sell_signal = rx/<$up_trend><$down_trend> <( $2.length == 1)>/;

    broker()
    {
        my $wire = new Stock::Ticker('NYSE');

        given $wire
        {
            when <$buy_signal>  { buy(); }
            when <$sell_signal> { sell(); }
        }
    }

Nifty is one way to put it...
=Austin

    rule nifty()
    {
        (/./ <(.color == "Blue")>)<2>
        /./ <(.isa(Bird))>
        /./ <(.can(Fly))>
    }

> Your
> way puts all of the logic in the pattern method when it's probably
> better in the rule itself, especially if the stream of objects we're
> matching against is heterogeneous:
>
> @things ~~ /<.color(Blue)><2><.isa(Bird)><.can(fly)>/;
>
> versus
>
> @things ~~ /<Blue><2><Bird><fly>/;
>
> Imagine that @things contains 2 X objects and 2 Y objects that would
> cause the match to succeed. The pattern method for the Y object would
> have to somehow handle both the .isa and .can ideas. How would it
> know which to apply when?
Presumably that's up to the match engine. There's an old text-searching
algorithm that used an inverted frequency table to find the "least
common characters" in the query string and then searched for them in
the body of text. If it found one, it would back up the appropriate
number of places and try a match; if there was no match, it would move
forward a bit and try again.
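
The algorithm isn't named here, so the following Python sketch is only a reconstruction from that description, not any particular published algorithm. The function name rare_char_search and the choice to build the frequency table from the text itself (when none is supplied) are assumptions.

```python
from collections import Counter


def rare_char_search(text, query, freq=None):
    """Find query in text by scanning for its least common character."""
    if not query:
        return 0
    # Assumption: with no table supplied, count characters in the text.
    freq = freq if freq is not None else Counter(text)

    # Offset within the query of its least common character.
    offset = min(range(len(query)), key=lambda i: freq.get(query[i], 0))
    rare = query[offset]

    # Start at `offset` so that backing up never goes before index 0.
    pos = text.find(rare, offset)
    while pos != -1:
        start = pos - offset              # back up to where the query would begin
        if text.startswith(query, start):
            return start
        pos = text.find(rare, pos + 1)    # no match: move forward and try again
    return -1


text = "the quick brown fox jumps"
hit = rare_char_search(text, "ox j")
miss = rare_char_search(text, "cat")
```

The win is that most of the scan is a single-character search for something that rarely occurs, and the full comparison only runs at the few positions where that character turns up.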
That behavior is up to the parse engine. Remember that throwing an
exception will cause a backtrack, so if the Blue objects don't know how
to handle a .can(Fly) request, they'll "die" or the dispatcher will,
and the rx engine will backtrack and try a different configuration.
=Austin
No, they're Colorifics (Colorific was a class, unless I
misremembered). Regardless of whether the specific colors are
subclasses, or just real Colorifics with properties, they differ from
one to another, so they can be atomic.
You're correct, THIS doesn't make sense:
my $blue = new Colorific;
my $p = new Parser::RecursiveDescent( atom => $blue );
But there's no difference between matching Objects and matching
Colorifics except for some extra type safety if it can be provided.
But, as Larry said, patterns can't be generalized beyond being useful
and convenient for text (which reminded me once again why he's the one
in charge of all the decisions). Stringifying them and associating
those characters with the object itself seems like it would
work... but I'm beginning to see major problems with that. I still be
thinkin'....
Luke
Um ... so what happens when there's a non-Colorific in the stream you
are matching against? Does the parser croak? Does it just skip it? Does
it try to transmogrify it into a Colorific? What happens?
This is why I say that "Colorific atoms" don't make sense. If we have
atoms of arbitrary objects, we can use any of the objects' attributes to
match against. Colorific is just an attribute of an object. That
object may be a class, or an instance or anything.
What happens depends on the object behavior. If the object acts like a
Colorific, and conforms to the rules of the grammar, then the grammar
happens.
If an exception is thrown, as for example when a method invocation
fails because the object doesn't support that method, or because a
function was called with an arg that doesn't match the signature, the
pattern will fail. This is documented in A5.
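
As an illustration of that behavior (a Python sketch of the idea only, not of how a Perl 6 rule engine would actually do it), a predicate can be wrapped so that any exception it raises counts as an ordinary failed match. The names safely, Rock, and Sparrow are hypothetical.

```python
def safely(predicate):
    """Wrap a predicate so that an exception means 'did not match'."""
    def wrapped(obj):
        try:
            return bool(predicate(obj))
        except Exception:
            # e.g. the object has no such method: treat as a failed match
            return False
    return wrapped


class Rock:
    color = "Blue"              # has a color, but no can() method


class Sparrow:
    color = "Brown"

    def can(self, ability):
        return ability == "fly"


can_fly = safely(lambda obj: obj.can("fly"))

sparrow_ok = can_fly(Sparrow())   # the method exists and returns True
rock_ok = can_fly(Rock())         # AttributeError is swallowed, so no match
```

With predicates wrapped this way, a non-Colorific (or non-flying) object in the stream simply fails to match at that position, and the engine is free to try elsewhere.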
>
> This is why I say that "Colorific atoms" don't make sense. If we have
> atoms of arbitrary objects, we can use any of the objects' attributes
> to match against. Colorific is just an attribute of an object. That
> object may be a class, or an instance or anything.
Sure. But if I write my patterns expecting a series of Colorific
objects, that's MY problem. Just like if you pass ℅≶ to a
pattern like /[0-9]+/ -- it fails. That's YOUR problem -- your data and
your pattern didn't agree: fix your code.
I think the point is that if I've invested the MIPS in generating a
massive array of Colorific objects, then I should be able to tell the
compiler, "This is an array of Colorific objects. Do pattern matching
using that knowledge" and reap the benefits of my foresight.
Really, this is just the same argument as signatures: If I adhere to
some basic constraints, I should be able to tell the compiler about the
constraints, and benefit from my own self-discipline.
=Austin
Ah, you're right, I just had blinders on there for a little bit.