Thanks,
/Autrijus/
Not unless you want to write the Halting engine that determines that 3
is in fact more specific than 2..10. It's based on definition order,
so that if you have dependencies in your conditions (which you
oughtn't), you'd better define the multis together to get well-defined
semantics.
> > The upshot is that these are now errors:
> >
> > sub foo ($x) is rw { $x }
> > my $a;
> > foo($a) = 4; # runtime error - assign to constant
>
> I assumed lvalue subs would implicitly return void and an
> assignment goes to the function slot of the args used in the assignment
> and subsequent calls with these args return exactly this value.
> In that respect arrays and hashes are the prime examples of lvalue
> subs. Other uses are interpolated data, Delaunay triangulation, etc.
Well, in the absence of optimization, what's usually going on is that
the lvalue sub is returning a tied proxy object, which you then call
STORE on.
Luke
> Not unless you want to write the Halting engine that determines that 3
> is in fact more specific than 2..10. It's based on definition order,
> so that if you have dependencies in your conditions (which you
> oughtn't), you'd better define the multis together to get well-defined
> semantics.
That seriously sucks.
Multis rock because they let you append to an interface from your
perspective.
If it's just a pretty form of casing, then we aren't gaining
anything, IMHO.
--
() Yuval Kogman <nothi...@woobling.org> 0xEBD27418 perl hacker &
/\ kung foo master: /me groks YAML like the grasshopper: neeyah!!!!!!
http://svn.openfoundry.org/pugs/docs/mmd_match_order.txt
Values may be compiled into where clauses which are eventually just
a big given/when behind the scenes, but the order in which they are
checked must be integrated with type checking, and must be sorted to
make sense.
If you cannot define a more particular case of a method in order to
optimize for:
* speed
* simplicity
then you lose on a lot of what makes MMD a useful tool for post-oop.
In code which was preplanned for MMD, that was properly ordered, MMD
is useful as a subset of its behavior - it's just pattern matching.
This is nice, but has none of the extensibility that MMD can offer
if done differently.
/\ kung foo master: /me sneaks up from another MIME part: neeyah!!!!!
He meant:
http://svn.openfoundry.org/pugs/docs/notes/mmd_match_order.txt
Luke
> > the one defined LATER in the file wins
That should read
"the one defined in the LATER file wins"
=)
> If we're going to make a choice for the user (something we usually
> avoid), we might as well go with the one that I would pick :-)
Blah blah blah, write a pragma, blah blah blah.
I tend to agree on generic -> specific, but if it is to be read like
given/when in a way, which arguably it is behind the scenes, then
maybe we should make given { } take the statements in a block and
execute them from last to first?
> I like the idea of your tree of match order, I just don't like the
> tree itself too much.
It isn't a tree... see below
> If we're going to reorder things for the user,
> it does need to happen in a predictable way, even if it's not correct
> 100% of the time. I find your tree to be pretty complex (that could
> be because I don't understand the reasoning for the ordering
> decisions). I'd prefer something more like:
>
> 1. Constants
> 2. Junctions / Ranges
> 3. Regexes
> 4. Codeblocks
This is pretty much the same as what I proposed...
The sub points are usually clarifications, not a tree.... Did you
actually read it?
It discusses types, roles, inheritance, and so forth, as well as
measuring the "specificity" of junctions of values and types. It's
long because it goes into detail.
> Where none of them is recursively descended into for matching. That
> particular order has no special significance, it just felt natural.
> I'm just pointing out that it should be simple[1].
I agree with simplicity.
Please read the sequence I proposed - it's trying to define in
great detail the simplest rules I can think of.
> it is only simple and predictable when you have the whole class
> hierarchy in your head.
That's why under the fourth step I detailed that MI confusions are
a fatal error, possibly at compile time.
--
() Yuval Kogman <nothi...@woobling.org> 0xEBD27418 perl hacker &
/\ kung foo master: *shu*rik*en*sh*u*rik*en*s*hur*i*ke*n*: neeyah!!!!
Hmm. I wonder if we should just make the later ones win in all cases.
Generally when I structure code, I find it most natural to go from
general to specific. If we're going to make a choice for the user
(something we usually avoid), we might as well go with the one that I
would pick :-)
I like the idea of your tree of match order, I just don't like the
tree itself too much. If we're going to reorder things for the user,
it does need to happen in a predictable way, even if it's not correct
100% of the time. I find your tree to be pretty complex (that could
be because I don't understand the reasoning for the ordering
decisions). I'd prefer something more like:
1. Constants
2. Junctions / Ranges
3. Regexes
4. Codeblocks
Where none of them is recursively descended into for matching. That
particular order has no special significance, it just felt natural.
I'm just pointing out that it should be simple[1].
Still, I very much agree with your desire to be able to extend someone
else's interface, which we can solve by messing with the tiebreaking
order.
Luke
[1] That is also my complaint about the Manhattan metric for
multimethod resolution: it is only simple and predictable when you
have the whole class hierarchy in your head.
--
() Yuval Kogman <nothi...@woobling.org> 0xEBD27418 perl hacker &
/\ kung foo master: /me tips over a cow: neeyah!!!!!!!!!!!!!!!!!!!!!!
I suppose I was mostly commenting on the junctions part. I'm
proposing that All Junctions Are Created Equal. That is, there is no
specificity measuring on junctions. I also didn't really understand
your right-angle-tree-ratio measure. Does it have a name, and is
there a mathematical reason that you chose it?
Anyway, I think that once we start diving inside expressions to
measure their specificity, we've gotten too complex to be predictable.
Luke
Eek! no.
I think guards (our where closures which I call where clauses) are
enough... =)
If you want to optimize simple where clauses by introspecting their
PIL, that's a different story =)
--
() Yuval Kogman <nothi...@woobling.org> 0xEBD27418 perl hacker &
It's really nothing more than a metaphor.
method bark ((Dog | Sheep) $inv:) {
}
method bark ((Retriever & Shepherd) $inv:) {
}
When I dispatch a 'bark' method on my $uber_dog, I'd like it not to
be a sheep. The way this is weighted is basically that '&' is more
particular than '|'.
The tree dimensions thing deals with nested junctions. Basically you
"draw" out the tree on some 2d space, where the root is (0,0), &
combinations are drawn on the Y axis, and | combinations are drawn
on the X axis.
Parent axes are stretched to fit their children's drawings.
Here are two examples:
((Dog & Retriever) | (Sheep & Stupid))
Dog Sheep
| |
+------x------+
| |
Retriever Stupid
((Dog | Sheep) & (Retriever | Stupid))
Dog-----|-----Sheep
|
x
|
Retriever--|-----Stupid
By graphically spanning the tree you can see if it tends to be wide
(ORish) or deep (ANDish). The score of the junction is the ratio of
the length on the x axis, over the length on the y axis.
The bigger the combinator is (the | in the first example, the
& in the second one), the bigger the line in the picture will be,
because the boxes under the lines must fit in the spaces (hence the
right angles bit) between the combinator's line and its siblings'.
This only shows up clearly in structures three or more levels deep:
((Dog & (Retriever | Shepherd)) | Sheep)
Dog
|
Sheep------------x------------|
|
|
Retriever--|--Shepherd
((Dog & (Retriever & Shepherd)) | Sheep)
Dog
|
Sheep------x-------|
|
_+_
Retriever
|
|
|
Shepherd
(assume for fairness that strings are a block, 1x1, and each line is
1 block thick).
Of course, you don't need to do that - you just do depth first
traversal, calculate the junction's "dimensions", and push upward.
The eventual junction will either be 'ANDish' or 'ORish', and
the tendency of the junctions is their order.
If there are candidates which are too close together for the same
parameter under a single shortname, the user should be warned.
For example, in the stupid sheep/retrieving dog example, you don't
want both definitions, but (Dog & Retriever) is definitely more
specific than (Sheep | Dog).
We discussed this a bit on #perl6, and we concluded that my tree
example was stupid. If you have a better explanation, please commit
it to the pugs repo, even if you don't agree with this - just for
the sake of clarity (because I can't explain it any better).
> Anyway, I think that once we start diving inside expressions to
> measure their specificity, we've gotten too complex to be predictable.
I think that may be right, but just for junctions it's very
tempting.
--
() Yuval Kogman <nothi...@woobling.org> 0xEBD27418 perl hacker &
> http://www.cs.washington.edu/research/projects/cecil/www/Papers/predicate-classes.html
Regardless of MMD, I think this is an interesting concept on its
own.
class Moosish does pred:where {
    ... # a where clause
} {
    # class def
}
Does this mean that conflicting signatures assure that only one
'where' clause passes?
The pred trait accepts a higher order type as its arg, and just
merges it with its methods' types by hooking the metamodel's
'add_method' method, or whatever it's called (stevan?).
I'm not sure I know how to opportunistically 'staticize' this,
though.
Interesting paper, although admittedly I only skimmed it.
--
() Yuval Kogman <nothi...@woobling.org> 0xEBD27418 perl hacker &
/\ kung foo master: /me dodges cabbages like macalypse log N: neeyah!
I'll be glad to work on it, yes, and thanks for sending it.
I would definitely appreciate any help that other p6l folks can provide
in putting these into an appropriate form for the Synopses.
Pm
I would like to point out that for mere mortals, *any* MMD is already too
complex to be predictable. Some people can't even predict SMD. :-/
Regardless of the MMD policy (or range of policies) we allow/enforce,
I think we need to consider what the naive user is to do in the face
of the (to them) black box of MMD. I wonder if there's some way to
annotate a call to say exactly which routine you expect it to call, and
then if MMD dispatches elsewhere, you get some kind of a warning that
tells you exactly why it chose the other routine over your routine.
It doesn't have to dump the whole MMD decision tree on them, but
merely say something like "Argument $foo picked a constrained type
over an unconstrained type". Or "Argument $bark picked Dog with
distance 1 over Mammal with distance 2". Or "Argument $bark picked
'Dog where...' with distance 1-ε over Dog with distance 1".
Unless we can feed more specific information to the naive user
in an easily digestible form, I'm still inclined to say that *any*
constraint just subtracts "epsilon" from the distance, and if you don't
write your constraints to be mutually exclusive within a single file,
and you depend on the dispatch order to distinguish, it's erroneous.
(We could still subtract additional values of epsilon for later files
to make Yuval happy--or at least less unhappy...)
Actually, a naive user probably doesn't even want to see the epsilons.
We could go as far as to make it:
Argument $bark picked Dog where... with distance 0.99
over Dog with distance 1
Then Yuval's overriding files can be distance 0.98, 0.97, 0.96, etc.
An epsilon of .01 should be small enough for anyone. (Or at least
any engineer.)
The warner should also detect ambiguities in constraints if we make all
constraints in a file the same epsilon. I just showed the warning for
a single argument, but it should probably tell the distance on all
the arguments that differ, and maybe even calculate the overall distance
for them.
Of course, if we make the MMD rules sufficiently complicated, we'll
just have to make the warning spit out a spreadsheet to show the
calculations. Then we hide all that behind an interview process,
just like all our wonderful tax preparation software...
Larry
Epsilons are a bit like handwaving, in the sense that it's not as
clear cut.
I'd rather they be unexposed for the most part (see below for
exception), but that in general different sets of constraints are in
a totally different ballpark of weighting.
I think any ambiguity that is not explicitly resolved from the
caller (by means of disambiguation syntax that is like the
annotation syntax, whatever it may be) should be an error
(preferably compile time, if the type of the value we're dispatching
on is inferred or declared).
> (We could still subtract additional values of epsilon for later files
> to make Yuval happy--or at least less unhappy...)
;-)
I actually think that definition order is now irrelevant, with the
exact semantics I proposed.
Rob Kinyon had a strong argument (in #perl6) that anything that
depends on load order is bound to make someone's head hurt.
He has a point.
> Then Yuval's overriding files can be distance 0.98, 0.97, 0.96, etc.
> An epsilon of .01 should be small enough for anyone. (Or at least
> any engineer.)
Hmm... that's a useful hack, but I don't think it's much more than a
last-resort type hack. Either way, I will always place line-long
comments on exactly why I'm adding that value. I'd rather have my
code document itself, much like functional pattern matching.
> Of course, if we make the MMD rules sufficiently complicated, we'll
> just have to make the warning spit out a spreadsheet to show the
> calculations. Then we hide all that behind an interview process,
> just like all our wonderful tax preparation software...
I think MMD's weakness in this respect is that for it to be
intuitively useful, it needs just the right amount of complexity.
You don't want to think about numerical ranking, or placement in a
tree of loaded modules which you need to read 10 files to find out
(or maybe that you can't find out at all (perhaps a parrot
disassembler helps in some cases)), and you don't want to think
about rules of a complicated system either.
You'd like, ideally, for 95% of the cases to be intuitive as far as
writing goes (writing either being taking care of functionality or
using it). When writing you usually know what values you're dealing
with (or at least you're resolving to know later). In that case, if
you can make simple rules with some perl that you know, and they are
intuitively ranging from specific to generic in their constraint,
you should be happy. For when you're confused or forgetful and the
system isn't DWIM enough, I guess 4.5% can be dealt with using
errors.
I don't know of a method to take care of the remaining 0.5% elegantly,
but I think that discouraging the use of MMD in order to shrink that
0.5% is a bigger mistake.
--
() Yuval Kogman <nothi...@woobling.org> 0xEBD27418 perl hacker &
/\ kung foo master: MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM: neeyah!
It's not clear to me why we shouldn't *also* have a parse-tree level that is
taken to be consistent with whatever the current language is, as long
as we retain enough information to deparse it into the current language,
or to compile down to PIL.
: * The "Hash", "Int", "Str" etc are just roles:
: role Hash[?::returns=Any, ?::shape=Str] {
: }
: implementation classes are known as "PerlHash", "PerlInt" etc.
I thought part of the reason for allowing roles to act as classes via
an anonymous class is to allow both the role and the class in question
to be referred to via name "Hash".
: * Filehandles open chomped by default; you need to add the `:unchomped` flag
: to turn off chomping.
s/unchomped/newlines/
: my $fh = open 'file'; # autochomp
: my $fh = open 'file', :newlines; # non-autochomp
: $fh.newline = '\r'; # now split on \r
: $str = $fh.readline;
: $str.newline; # \r
:
: for chomp =$fh { ... }
:
though perhaps it would be less confusing with :newline if we renamed
:unchomped to :savenl instead. Or maybe change 'newline' to 'nl'.
In that case, we have to differentiate them a little better than with
just a trailing 's', since open can take either of them. I think I
vote for :newlines and :nl("\r"). (Along with .newlines as boolean
and .nl as string/rule attributes on the handle, and a .nl string/match
attribute on the input string.)
: If .newline is a rule, then its captured variables are made available to the
: calling block as if it has done a matching.
More precisely, if $fh.nl is a rule, then $str = =$fh sets $/. The match
object is also returned as $str.nl.
: * `&prefix:<int>` now always means the same thing as `&int`. In the symbol table
: it's all stored in the "prefix" category; &int is just a short-name way of
: looking it up -- it's just sugar, so you can't rebind it differently.
Might end up actually storing the short form of those and inferring the
'prefix' as needed. Dunno.
: * Constrained types in MMD position, as well as value-based MMDs, are _not_
: resolved in the type-distance phase, but compile into a huge given/when
: loop that accepts the first alternative. So this:
:
: multi sub foo (3) { ... }
: multi sub foo (2..10) { ... }
:
: really means:
:
: multi sub foo ($x where { $_ ~~ 3 }) { ... }
: multi sub foo ($x where { $_ ~~ 2..10 }) { ... }
:
: which compiles two different long names:
:
: # use introspection to get the constraints
: &foo<ANONTYPE_1>
: &foo<ANONTYPE_2>
:
: which really means this, which occurs after the type-based MMD tiebreaking
: phase:
:
: given $x {
: when 3 { &foo<ANONTYPE_1>.goto }
: when 2..10 { &foo<ANONTYPE_2>.goto }
: }
:
: in the type-based phase, any duplicates in MMD are rejected as ambiguous; but
: in the value-based phase, the first conforming one wins.
See subsequent discussion, including the part that hasn't happened yet. :-)
Larry
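For readers following along, the first-match-wins behaviour in the quoted given/when rewrite can be sketched roughly as below. This is Python rather than Perl 6, since the Perl 6 syntax here was still in flux. The names multi, variants, foo_three and foo_range are made up for the illustration, and the lambdas merely stand in for where clauses; it is not how Pugs implements anything.

```python
# Hypothetical sketch: value-constrained variants are tried in definition
# order, and the first one whose guard accepts the argument wins.
variants = []  # (guard, implementation) pairs, in definition order


def multi(guard):
    """Register a variant guarded by a predicate (stands in for a where clause)."""
    def register(impl):
        variants.append((guard, impl))
        return impl
    return register


@multi(lambda x: x == 3)
def foo_three(x):
    return "foo(3)"


@multi(lambda x: 2 <= x <= 10)
def foo_range(x):
    return "foo(2..10)"


def foo(x):
    # Equivalent of the big given/when: scan in order, take the first match.
    for guard, impl in variants:
        if guard(x):
            return impl(x)
    raise LookupError(f"no variant of foo accepts {x!r}")
```

Swapping the two definitions would make foo(3) land in the 2..10 variant, which is the order dependence the rest of the thread argues about.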
>Rob Kinyon had a strong argument (in #perl6) that anything that
>depends on load order is bound to make someone's head hurt.
>
>He has a point.
>
>
Especially if one is working in something like mod_perl, and the order
various modules were actually loaded in can vary greatly from the order
they are listed in the source code.
Unless we have every lexical scope keep track of what order *it* thinks
all the MMD methods *should* have been loaded in, which overall feels
very painful.
A thought I've had is whether there should be a "subname" that can be
defined on a given multi, to identify it as distinct from the others,
without having to type the full signature. Something analogous to
HTTP/HTML # suffixes. One could then use that subname in conjunction with
the short name to refer to a specific method. This could then let a user
easily skip MMD when DWIMmery fails. To be useful, it would need to be
simple syntax. I'll propose forcing "# as comment" to be "\s+# as
comment" (if it isn't already), and have subnames specified as
shortname#subname.
multi method foo#bar (Num x) {...}
multi method foo#fiz (String x) {...}
$y = 42;
$obj.foo#fiz($y); # even though $y looks like a Num
$obj.foo($z); # let MMD sort it out.
It's unclear if
$obj.foo<String>($y);
even works, or should work, even if it does.
It by no means solves all of Yuval's problems, but it would be a handy
workaround to un-multi your calls.
-- Rod Adams
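A rough Python sketch of the subname idea above, for illustration only. The Multi class, its variant decorator and the named method are all invented here; named("fiz") plays the part of foo#fiz, and the built-in int/float/str types stand in for Num and String.

```python
class Multi:
    """Hypothetical registry: variants keyed by subname, tried in order."""

    def __init__(self):
        self.variants = {}  # subname -> (accepted type, implementation)

    def variant(self, subname, accepted_type):
        def register(impl):
            self.variants[subname] = (accepted_type, impl)
            return impl
        return register

    def __call__(self, arg):
        # Normal call: let the dispatcher pick the first matching variant.
        for accepted_type, impl in self.variants.values():
            if isinstance(arg, accepted_type):
                return impl(arg)
        raise LookupError("no matching variant")

    def named(self, subname):
        # Explicit call: skip dispatch entirely, like foo#fiz in the proposal.
        return self.variants[subname][1]


foo = Multi()


@foo.variant("bar", (int, float))
def _(x):
    return f"Num variant got {x!r}"


@foo.variant("fiz", str)
def _(x):
    return f"String variant got {str(x)!r}"
```

With this, foo(42) goes through dispatch and picks the Num variant, while foo.named("fiz")(42) forces the String variant even though 42 looks like a Num.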
> multi method foo#bar (Num x) {...}
> multi method foo#fiz (String x) {...}
>
> $y = 42;
> $obj.foo#fiz($y); # even though $y looks like a Num
> $obj.foo($z); # let MMD sort it out.
>
Having additional tags might also give us something to hang priority
traits off: "foo#bar is more_specific_than(foo#baz);" might influence
the order of clauses in the implicit given/when block. It feels like
there should be a generalization of operator precedence here (even
though the two are superficially dissimilar, the looser/tighter concept
appears valid).
I like that =)
--
() Yuval Kogman <nothi...@woobling.org> 0xEBD27418 perl hacker &
/\ kung foo master: /me has realultimatepower.net: neeyah!!!!!!!!!!!!
Intuitively I'd say $obj.foo(String<$y>) or something like that...
$obj.foo<String> reads like MMD on the return value to me, and in
that case I'd prefer
String<$obj.foo($y)>
or maybe a type is a part of the context? Then we can use C casting
syntax, and it'll actually make sense.
(where { ... })$value
;-)
--
() Yuval Kogman <nothi...@woobling.org> 0xEBD27418 perl hacker &
/\ kung foo master: /me supports the ASCII Ribbon Campaign: neeyah!!!
On Jul 8, 2005, at 4:25 PM, Dave Whipp wrote:
> Rod Adams wrote:
>
>
>> multi method foo#bar (Num x) {...}
>> multi method foo#fiz (String x) {...}
>> $y = 42;
>> $obj.foo#fiz($y); # even though $y looks like a Num
>> $obj.foo($z); # let MMD sort it out.
>>
Instead of changing the parse rules for #, why not just use a trait?
multi method foo is short_name('bar') {...}
> Having additional tags might also give us something to hang
> priority traits off: "foo#bar is more_specific_than(foo#baz);"
> might influence the order of clauses in the implicit given/when
> block. It feels like there should be a generalization of operator
> precedence here (even though the two are superficially dissimilar,
> the looser/tighter concept appears valid).
Although I like the idea of reusing this concept, I'm not sure that
it really solves the problem. Fundamentally, we're trying to make
MMD behave intuitively with no programmer effort.
--Dks
> Could we break them out into separate threads so that our poor summarizer doesn't go
> bonkers?
See? That's what specialization/particulation is good for. Thanks
for strengthening my point!
>
> On Jul 8, 2005, at 4:25 PM, Dave Whipp wrote:
>
>> Rod Adams wrote:
>>
>>
>>> multi method foo#bar (Num x) {...}
>>> multi method foo#fiz (String x) {...}
>>> $y = 42;
>>> $obj.foo#fiz($y); # even though $y looks like a Num
>>> $obj.foo($z); # let MMD sort it out.
>>>
>
>
> Instead of changing the parse rules for #, why not just use a trait?
>
> multi method foo is short_name('bar') {...}
I thought about that, but then thought that to become commonplace it was
a bit much to type. I also couldn't come up with a way to call a given
multi that matches on a given attribute, without adding even more
complexity to MMD.
>
>> Having additional tags might also give us something to hang priority
>> traits off: "foo#bar is more_specific_than(foo#baz);" might
>> influence the order of clauses in the implicit given/when block. It
>> feels like there should be a generalization of operator precedence
>> here (even though the two are superficially dissimilar, the
>> looser/tighter concept appears valid).
>
>
> Although I like the idea of reusing this concept, I'm not sure that
> it really solves the problem. Fundamentally, we're trying to make
> MMD behave intuitively with no programmer effort.
Well, if one views MMD as "a list of methods to try, each with its own
requirements on its arguments", then it can completely solve the
problem, along with a method sort function.
1) take all methods the user specified "higher than/lower than/equal to"
out of the mix.
2) sort remaining methods via a standardized function.
3) put all the ones taken out in step 1 back in, where they are requested.
4) scan the methods, in order, for the first that accepts the given
arguments.
5) dispatch to the chosen one in #4
-or-
6) begin AUTOMETHing, etc.
Then all we need is a DWIMish sort function.
Some ideas:
-- longer parameter lists go before shorter ones.
-- if param(n) of one ISA param(n) of another, it goes first.
-- slurpies after non-slurpies
-- a hashkey of the parameter types (for deterministic coin flips)
I'm not committed to what goes into the method sort function, or in what
order, just the concept of it. To me it seems easier to visualize than
distances, etc. If nothing else, it should be easy to explain to users
and programmers.
With the name tagging idea from before, one could then say things like:
multi sub foo#lastresort (*@_) is after(foo#default) {...}
for when the default sort does things incorrectly.
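To make the sort-then-scan procedure above concrete, here is a minimal Python sketch under some loud assumptions. The Variant class, sort_key and ordered are invented names; "ISA goes first" is approximated by counting MRO depth rather than comparing parameters pairwise; the "hashkey" coin flip is replaced by the variant name; and an "after" override may only point at a variant that is not itself overridden.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    name: str
    params: tuple                 # required parameter types, in order
    slurpy: bool = False          # True if extra trailing arguments are allowed
    after: Optional[str] = None   # user override: place right after this variant

    def accepts(self, args):
        if self.slurpy:
            arity_ok = len(args) >= len(self.params)
        else:
            arity_ok = len(args) == len(self.params)
        return arity_ok and all(isinstance(a, p) for a, p in zip(args, self.params))


def sort_key(v):
    # Non-slurpy first, longer lists first, deeper (more derived) types first,
    # then the name as a deterministic tiebreak.
    depth = sum(len(p.__mro__) for p in v.params)
    return (v.slurpy, -len(v.params), -depth, v.name)


def ordered(variants):
    # Take the user-placed variants out, sort the rest, then put them back.
    pinned = [v for v in variants if v.after is not None]
    rest = sorted((v for v in variants if v.after is None), key=sort_key)
    for v in pinned:
        index = next(i for i, other in enumerate(rest) if other.name == v.after)
        rest.insert(index + 1, v)
    return rest


def dispatch(variants, *args):
    # Scan in order and report the first variant that accepts the arguments.
    for v in ordered(variants):
        if v.accepts(args):
            return v.name
    raise LookupError("no variant accepts these arguments")


foo = [
    Variant("lastresort", (), slurpy=True, after="default"),
    Variant("default", (object,)),
    Variant("int_str", (int, str)),
    Variant("bool_only", (bool,)),
    Variant("int_only", (int,)),
]
```

Here the "lastresort" variant mirrors the foo#lastresort example: it is slurpy and explicitly placed after "default", so it only catches calls nothing else accepts.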
A reasonable extension of this would be to have a coderef attribute
that determines if a supplied set of arguments is acceptable, rather
than the default check. This is a possible MTOWTDI for the 'where' clauses.
Then again, there are likely several glaring problems with this idea
that I'm just not seeing at the moment.
-- Rod Adams
> Then all we need is a DWIMish sort function.
Think of parameter list shape (slurpiness, arity) as a mold you can
fit stuff into.
Then it becomes a simple matter of reducing the match list to your
compatible subs.
Then see
http://svn.openfoundry.org/pugs/docs/notes/mmd_match_order.txt which
proposes a DWIMish sort function.
--
() Yuval Kogman <nothi...@woobling.org> 0xEBD27418 perl hacker &
/\ kung foo master: /me has realultimatepower.net: neeyah!!!!!!!!!!!!
> I would like to point out that for mere mortals, *any* MMD is already too
> complex to be predictable.
This is the relevant observation here.
This particular mortal's experience is that more than four variants, involving
parameters from more than two hierarchies makes it nearly impossible to
predict all the consequences of MMD.
That's why Class::Multimethods provides coverage and ambiguity-detection
tools, which I expect Perl 6 will need too.
> Regardless of the MMD policy (or range of policies) we allow/enforce,
> I think we need to consider what the naive user is to do in the face
> of the (to them) black box of MMD. I wonder if there's some way to
> annotate a call to say exactly which routine you expect it to call, and
> then if MMD dispatches elsewhere, you get some kind of a warning that
> tells you exactly why it chose the other routine over your routine.
> It doesn't have to dump the whole MMD decision tree on them, but
> merely say something like "Argument $foo picked a constrained type
> over an unconstrained type". Or "Argument $bark picked Dog with
> distance 1 over Mammal with distance 2". Or "Argument $bark picked
> 'Dog where...' with distance 1-ε over Dog with distance 1".
This is exactly the kind of coverage tools I mentioned above. I think it would
suffice to have a module that provides an C<is targeting> trait.
> Unless we can feed more specific information to the naive user
> in an easily digestible form, I'm still inclined to say that *any*
> constraint just subtracts "epsilon" from the distance, and if you don't
> write your constraints to be mutually exclusive within a single file,
> and you depend on the dispatch order to distinguish, it's erroneous.
I very strongly support this approach. Perhaps with the elaboration that each
re-specialization subtracts an additional epsilon. So I could distinguish:
type SingleDigit := Int where [0..9];
type Three := SingleDigit where 3;
multi sub foo(Int n) {...} #1
multi sub foo(SingleDigit n) {...} #2
multi sub foo(Three n) {...} #3
foo(3); # dispatches to #3 (distance = -2ε)
foo(4); # dispatches to #2 (distance = -ε)
foo(43); # dispatches to #1 (distance = 0)
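One way to picture cumulative epsilons without picking a magic number is to keep the distance as a pair (whole steps, epsilon count) and compare pairs lexicographically. The Python sketch below assumes exactly that; Constrained, VARIANTS and dispatch are illustrative names, not anything from Class::Multimethods or Pugs.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Constrained:
    """Hypothetical constrained type: Int plus a chain of where predicates."""
    name: str
    checks: Tuple[Callable[[int], bool], ...]  # one predicate per "where"

    def accepts(self, value):
        return isinstance(value, int) and all(c(value) for c in self.checks)

    def distance(self):
        # (whole steps, epsilons): tuples compare lexicographically, so the
        # epsilon count only matters when the whole-step distance ties.
        return (0, -len(self.checks))


Int = Constrained("Int", ())
SingleDigit = Constrained("SingleDigit", Int.checks + (lambda n: 0 <= n <= 9,))
Three = Constrained("Three", SingleDigit.checks + (lambda n: n == 3,))

VARIANTS = [(Int, "#1"), (SingleDigit, "#2"), (Three, "#3")]


def dispatch(value):
    viable = [(t.distance(), label) for t, label in VARIANTS if t.accepts(value)]
    if not viable:
        raise LookupError("no such multi")
    return min(viable)[1]
```

Because the second component is just a count, there is no limit on how many levels of specialization can be stacked.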
> (We could still subtract additional values of epsilon for later files
> to make Yuval happy--or at least less unhappy...)
This would make Damian very unhappy as it discriminates against good
development practices like refactoring code into modules.
> We could go as far as to make it:
>
> Argument $bark picked Dog where... with distance 0.99
> over Dog with distance 1
>
> Then Yuval's overriding files can be distance 0.98, 0.97, 0.96, etc.
> An epsilon of .01 should be small enough for anyone. (Or at least
> any engineer.)
Magic numbers are a Really Bad Idea. We managed to avoid them for both
operator precedence and regular MMD. It would be a real shame to introduce
them here.
And I think they're unnecessary. Cumulative infinitesimal epsilons from
cumulative C<where> modifiers does the job just as well, and has the distinct
advantage of not restricting specializations to 99 levels.
> The warner should also detect ambiguities in constraints if we make all
> constraints in a file the same epsilon. I just showed the warning for
> a single argument, but it should probably tell the distance on all
> the arguments that differ, and maybe even calculate the overall distance
> for them.
Again, Class::Multimethods has prior art for this approach. See the
demo.analyse.pl example included in the distribution.
Damian
I guess I have used MMD more than most people in this discussion. Indeed,
having both written Class::Multimethods and supervised a PhD that involved
adding MMD to C++, one might have assumed that I've already "served my
sentence" ;-).
Nevertheless, all that experience has convinced me that the simpler the
dispatch rules, the more usable the resulting MMD mechanism is. That's why
I've consistently advocated uniform Manhattan distance over
"left-most-best-fit". That's why I've always recommended an explicit C<is
default> marker. That's why I was opposed to any particular numerical epsilon
value. That's why I don't favour special treatment for junctions or multiple
inheritance trees.
The goal is always the same: to find a parameter list that most accurately
matches the argument list, taking into account the type generalizations
introduced by inheritance and the type specializations introduced by C<where>
clauses.
So, in my view the MMD mechanism ought to be something like:
1. Gather all visible variants with a compatible number of
parameters (taking into account the requirements of any C<where>
constraints)
2. If there are no such variants, throw a "no such multi" exception
3. Work out the Manhattan distance from the argument list to each
variant's parameter list.
4. If there is a unique minimum, call that variant
5. Otherwise, discard every variant whose Manhattan distance
isn't minimal
6. Work out the degree of specialization of each remaining argument
list (i.e. the total number of C<where> specializations on the
variant's complete set of parameters)
7. If there is a unique maximum, call that variant
8. Otherwise, if there is a compatible variant with an C<is default>
trait, call that variant
9. Otherwise, throw an "ambiguous call" exception.
This is a much less dwimmy solution than Yuval's or Luke's, but it has the
advantage that those nine steps reduce to eight words:
Unique least-inherited most-specialized match, or default
which will fit into most people's heads and still DWIM most of the time.
Note that specializations enter into the decision process at two points:
initially, they must be satisfied if the variant is to be considered "viable";
later, they are used as tie-breakers when resolving ambiguities.
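As a reading aid, the mechanism described above might be sketched in Python along these lines. Everything named here (Param, Variant, dispatch, NoSuchMulti, AmbiguousCall, the toy Num/Int classes) is invented for the illustration. Inheritance distance is taken as the position in the argument class's MRO, which matches "number of inheritance steps" only for single inheritance, and the default is used only when exactly one viable variant carries it.

```python
from dataclasses import dataclass


class NoSuchMulti(Exception):
    pass


class AmbiguousCall(Exception):
    pass


@dataclass(frozen=True)
class Param:
    cls: type
    wheres: tuple = ()   # predicates standing in for C<where> clauses

    def accepts(self, arg):
        return isinstance(arg, self.cls) and all(w(arg) for w in self.wheres)

    def distance(self, arg):
        # Inheritance steps from the argument's class up to the parameter type.
        return type(arg).__mro__.index(self.cls)


@dataclass(frozen=True)
class Variant:
    name: str
    params: tuple
    is_default: bool = False


def dispatch(variants, args):
    # Viable variants: right arity, types fit, every where clause satisfied.
    viable = [v for v in variants
              if len(v.params) == len(args)
              and all(p.accepts(a) for p, a in zip(v.params, args))]
    if not viable:
        raise NoSuchMulti(f"no variant accepts {args!r}")

    def manhattan(v):
        return sum(p.distance(a) for p, a in zip(v.params, args))

    least = min(manhattan(v) for v in viable)
    closest = [v for v in viable if manhattan(v) == least]
    if len(closest) == 1:
        return closest[0]

    def specialization(v):
        return sum(len(p.wheres) for p in v.params)

    most = max(specialization(v) for v in closest)
    special = [v for v in closest if specialization(v) == most]
    if len(special) == 1:
        return special[0]

    defaults = [v for v in viable if v.is_default]
    if len(defaults) == 1:
        return defaults[0]
    raise AmbiguousCall(f"{len(special)} variants tie for {args!r}")


class Num:
    def __init__(self, value):
        self.value = value


class Int(Num):
    pass


small = Variant("foo(Int where < 10)", (Param(Int, (lambda x: x.value < 10,)),))
general = Variant("foo(Num)", (Param(Num),))
```

With these two variants, dispatch([small, general], (Int(42),)) returns the Num variant because the constrained one is never viable, while Int(3) selects the constrained variant on distance alone.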
Using them to select the initial set of viable candidates is critical. If I have:
multi sub foo (Int $x where { $^x < 10 }) {...}
multi sub foo (Num $x) {...}
then I almost certainly want a call to:
foo(42);
to successfully call the second variant, rather than throwing an exception like:
Can't call multi sub foo (Int $x where { $^x < 10 }) when $x is 42.
Keeping C<where> clauses as the sole "more specialized" marker also gives the
developer *more* control than adding extra rules for junctions would. For
example, someone might prefer to treat all junctions as being equally special:
multi sub foo(Int&Str) {...}
multi sub foo(Int) {...}
or treat junctions as more special:
multi sub foo((Int|Str) where Any) {...}
multi sub foo(Int) {...}
or treat junctions as less special:
multi sub foo(Int|Str) {...}
multi sub foo(Int where Any) {...}
In each case, these variations in "significance" are now explicitly and
consistently marked.
Damian
OK, sorry if I missed this in an earlier discussion. For purposes of
calculating this Manhattan distance, I gather that we're treating lists of N
arguments/parameters as points in N-space. I further assume that the
monoaxial distance between a parameter coördinate and the corresponding
argument coördinate - the distance between two types, where the types are
known to be assignment-compatible - is the number of inheritance steps
between them?
And one more dumb question: why is it that the L[1] metric is superior to
the L[2] metric for this purpose?
The geometric interpretation does bring us into somewhat philosophical
territory. Not that that's anything new on this list. :)
Let me try a concrete example. Suppose that class Answer has subclasses
Animal, Vegetable, and Mineral, with respective subclasses Dog, Potato, and
Diamond. There are two methods named foo in scope, neither overriding the
other. One is declared to take (Animal, Vegetable, Mineral), the other
(Dog, Potato, Answer). Assuming the obvious memberships, which method
should foo(Snoopy, Mr_PotatoHead, HopeDiamond) call? And more importantly,
why do you feel that is the right answer?
According to Damian's metric, we have distances of 0+0+2=2 and 1+1+1=3, so
(Dog, Potato, Answer) is "closer" and would get called.
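The arithmetic for this example can be checked with a few lines of Python. The class names come from the example above; steps and manhattan are helper names made up here, and the MRO index is used as the step count, which is only valid because the hierarchy is single inheritance.

```python
class Answer: pass
class Animal(Answer): pass
class Vegetable(Answer): pass
class Mineral(Answer): pass
class Dog(Animal): pass
class Potato(Vegetable): pass
class Diamond(Mineral): pass


def steps(arg_type, param_type):
    """Inheritance steps from arg_type up to param_type (single inheritance)."""
    return arg_type.__mro__.index(param_type)


def manhattan(arg_types, param_types):
    # Sum the per-argument distances (the L[1] metric).
    return sum(steps(a, p) for a, p in zip(arg_types, param_types))


args = (Dog, Potato, Diamond)            # classes of Snoopy, Mr_PotatoHead, HopeDiamond
general = (Animal, Vegetable, Mineral)   # first foo
mixed = (Dog, Potato, Answer)            # second foo

print(manhattan(args, general))  # 1 + 1 + 1
print(manhattan(args, mixed))    # 0 + 0 + 2
```

The first signature comes out at 3 and the second at 2, so the second one is the "closer" match under this metric.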
It doesn't seem to make much practical sense. Multimethods are
generally written to be exclusive of ancestral methods. Ordinary
methods are generally written to be cumulative with ancestral methods.
Larry
I just noticed that our rewrite doesn't quite work unless you rewrite
every "when" clause in the first form to also return $_, since "when"
blocks would escape past the return of the $_. Any form of "leave"
could have the same problem. I think the proper semantics of "but"
are that it ignores any return value however generated and pretends
the topic was returned. In fact, the original closure should probably
be evaluated in void context. So it's doing something more complicated
like:
my $foo = do {
    given Cls.new {
        given $_ {
            .attr = 1;
        }
        $_;
    }
};
But hey, that just makes the monkey-but sugar seem all the sweeter. :P
Larry
> OK, sorry if I missed this in an earlier discussion. For purposes of
> calculating this Manhattan distance, I gather that we're treating lists of N
> arguments/parameters as points in N-space. I further assume that the
> monoaxial distance between a parameter coördinate and the corresponding
> argument coördinate - the distance between two types, where the types are
> known to be assignment-compatible - is the number of inheritance steps
> between them?
Correct. This is the usual underlying metric, regardless of MMD scheme.
> And one more dumb question: why is it that the L[1] metric is superior to
> the L[2] metric for this purpose?
The use of summed lineal distance (L[1]) rather than RMS distance (L[2])
probably *isn't* superior as a closeness measure. But it's computationally
much simpler (and hence likely to be more efficient), it doesn't suffer from
precision issues in "photo finishes", and it is almost certainly easier for
the average programmer to predict correctly.
That said, I'd have no *particular* objection to an MMD implementation that
used RMS inheritance distance as its metric, provided the dispatch performance
was not appreciably worse.
Damian
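To see that the choice of metric is not purely cosmetic, here is a small Python comparison using the per-argument distances from the Animal/Vegetable/Mineral example earlier in the thread, (1, 1, 1) for one signature and (0, 0, 2) for the other. The function names l1 and l2 are just labels for this sketch, and l2 is the plain Euclidean norm.

```python
import math


def l1(distances):
    # Summed lineal distance.
    return sum(distances)


def l2(distances):
    # Euclidean distance; needs floating point, hence the "photo finish" worry.
    return math.sqrt(sum(d * d for d in distances))


general = (1, 1, 1)   # (Animal, Vegetable, Mineral) signature
mixed = (0, 0, 2)     # (Dog, Potato, Answer) signature

print(l1(general), l1(mixed))
print(l2(general), l2(mixed))
```

Under L[1] the mixed signature is closer (2 against 3), whereas under L[2] the general one is (the square root of 3, about 1.73, against 2), so the two metrics would dispatch this call differently.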
>> Unique least-inherited most-specialized match, or default
>
>
> Do I read this correctly as dispatching partly in the class hierarchy
> and partly in the type hierarchy?
Err. The class hierarchy *is* the type hierarchy in Perl 6.
> Or do you mean with 'least-inherited'
> most specific non-where type and with 'most-specialized' the strictest
> where clause?
I mean: least cumulative derivation distance summed over the set of parameter
types, with greatest *number* of parameter specializations as a tie-breaker
(if required).
Damian