As recently discussed here: http://www.reddit.com/r/programming/comments/mmrmj/the_expression_problem/
The Expression Problem is a new name for an old problem. The goal is
to define a datatype by cases, where one can add new cases to the
datatype and new functions over the datatype, without recompiling
existing code, and while retaining static type safety (e.g., no casts)
I'm guessing this is how you would approach it using Go interfaces:
// shape package defines squares and circles with an area method
type Shape interface {
Area() float64
}
type Square struct {
Side float64
}
func (r *Square) Area() float64 {
return r.Side * r.Side
}
type Circle struct {
Radius float64
}
func (c *Circle) Area() float64 {
return math.Pi * c.Radius * c.Radius
}
// PShape package extends existing shapes with a perimeter method
type PShape interface {
Shape
Perimeter() float64
}
type PSquare Square
func (s *PSquare) Area() float64 {
return (*Square)(s).Area()
}
func (s *PSquare) Perimeter() float64 {
return s.Side * 4
}
type PCircle Circle
func (c *PCircle) Area() float64 {
return (*Circle)(c).Area()
}
func (c *PCircle) Perimeter() float64 {
return 2 * math.Pi * c.Radius
}
// Rect package adds a new shape that implements both shape and pshape
interfaces
type Rect struct {
W, H float64
}
func (r *Rect) Area() float64 {
return r.W * r.H
}
func (r *Rect) Perimeter() float64 {
return r.W*2 + r.H*2
}
// test package
func printArea(s Shape) {
fmt.Println("area ", s.Area())
}
func printPerimeter(s PShape) {
fmt.Println("perimeter ", s.Perimeter())
}
func printBoth(s PShape) {
printArea(s)
printPerimeter(s)
fmt.Println()
}
func main() {
s := &Square{2}
c := &Circle{1}
r := &Rect{2, 3}
printArea(s)
printArea(c)
printArea(r)
printPerimeter((*PSquare)(s))
printPerimeter((*PCircle)(c))
printPerimeter(r)
fmt.Println()
printBoth((*PSquare)(s))
printBoth((*PCircle)(c))
printBoth(r)
}
Note: I wrote this in one file, but the idea is that it's separated
into different packages and that you should be able to both add new
shapes and add new methods without changing the source code of the
original package. (Or even recompiling it)
While this solution is reasonably nice, there's two problems:
1. Extending the interface requires you to re-implement the existing
functions (Area() in this case)
2. The client needs to manually handle type casts to extended
interfaces
Is there a better way?
Hi mailing list, this might be an interesting problem for you:
As recently discussed here: http://www.reddit.com/r/programming/comments/mmrmj/the_expression_problem/
The Expression Problem is a new name for an old problem. The goal is
to define a datatype by cases, where one can add new cases to the
datatype and new functions over the datatype, without recompiling
existing code, and while retaining static type safety (e.g., no casts)
I'm guessing this is how you would approach it using Go interfaces:
<snip>
While this solution is reasonably nice, there's two problems:
1. Extending the interface requires you to re-implement the existing
functions (Area() in this case)
2. The client needs to manually handle type casts to extended
interfaces
This allows us to avoid writing the boilerplate for the existing
methods.
However, now you can't typecast a *Square into a *PSquare anymore:
printBoth((*PSquare)(s)) // not allowed
So you have to create a new object:
printBoth(&PSquare{s}) // works
Something which is kinda strange/interesting is that it's actually
possible to modify the embedded "parent" pointer of an object:
ps := PSquare{s}
printBoth(&ps)
ps.Square := s2
printBoth(&ps)
I do not understand why designing a special-case language and compiler
for the Expression Problem would be an issue. You "just" design such a
language and implement the compiler. No?
>> type PSquare struct { *Square }
>>
>
> This allows us to avoid writing the boilerplate for the existing
> methods.
>
> However, now you can't typecast a *Square into a *PSquare anymore:
You don't need type casts; a composite literal and a selection achieve the same result:
ps := PSquare{s}
s2 := ps.Square
> So you have to create a new object:
>
> printBoth(&PSquare{s}) // works
Don't use a *PSquare; a PSquare is already a [struct containing a] pointer to a Square. Notice I wrote the method with a PSquare receiver. "Creating a new object" is an overstatement. All the compiler does is notice that the type has changed. The representation is identical.
> Something which is kinda strange/interesting is that it's actually
> possible to modify the embedded "parent" pointer of an object:
>
> ps.Square := s2
Replacing a struct and replacing its single field are the same operation. Thus, this is equivalent to:
ps = PSquare{s2}
which is
ps = (*PSquare)(s2)
in the other case. Not all that odd. Just don't think of the PSquare struct as a separate object. It only exists for the sake of type analysis.
Oh, I missed that you left the * out. Yeah, this version is probably
the best one.
In my opinion, Go does not have any kind of "native support" for the
Expression Problem (EP).
I am not an expert on EP. The following pseudo-code demonstrates my
current understanding of EP (--- separates compilation units (files),
T is time):
--- (T=0)
// 'Expression' is an interface with an 'Eval()' method
Expression: add method Eval() -> int
--- (T=1)
// 'Constant' implements 'Expression@T=0'
type Constant: v:int
Constant.Eval() = v
// 'Addition' implements 'Expression@T=0'
type Addition: a:Expression b:Expression
Addition.Eval() = a.Eval()+b.Eval()
--- (T=2)
// Add new method 'String()' to interface 'Expression'
Expression: add method String() -> string
// The compiler demands 'Constant' and 'Addition'
// to implement the newly added method.
// Both will implement 'Expression@T=2'.
Constant.String() = itoa(v)
Addition.String() = a.String() ++ "+" ++ b.String()
--- (T=3)
// New datatype 'Subtraction' implements 'Expression@T=2'
type Subtraction: a:Expression b:Expression
Subtraction.Eval() = a.Eval()-b.Eval()
Subtraction.String() = a.String() ++ "-" ++ b.String()
--- (end of pseudo-code)
This is only a small example, so it is possible that expanding the
code and using EP on a regular basis would reveal some issues I do not
see yet, such as type system issues.
And that the compiler would then check that all existing types that
implement Expression now implements the String() method.
It would need to do this because otherwise lines like this would not
type check:
> Subtraction.String() = a.String() ++ "-" ++ b.String()
I don't think this is the right interpretation, because that would be
very fragile.
As soon as anyone decided to "monkey patch" in an additional method,
all existing code would break.
And also, the (somewhat difficult to read) code in Wadler's paper
seems to extend the interface by introducing a new interface name
("Lang2"):
http://www.cs.purdue.edu/homes/jv/510s05/papers/wadler.txt
So I argue that step (T=2) in your example should introduce an
extended interface with a new name.
But this is also where it gets a bit complicated. You can see that
Wadler's solution is actually quite a mess, there's now two versions
of everything:
Lang.Exp vs Lang2.Exp
Lang.Plus vs Lang2.Plus
Lang.Eval vs Lang2.Eval
etc..
So I would say Go manages to solve it quite well:
http://pastebin.com/GLhaRtQM
Yes.
>
> > Expression: add method String() -> string
>
> And that the compiler would then check that all existing types that
> implement Expression now implements the String() method.
Yes.
> It would need to do this because otherwise lines like this would not
> type check:
>
> > Subtraction.String() = a.String() ++ "-" ++ b.String()
>
> I don't think this is the right interpretation, because that would be
> very fragile.
> As soon as anyone decided to "monkey patch" in an additional method,
> all existing code would break.
Existing code wouldn't break, because adding a method and not
implementing it for all datatypes would be rejected by the compiler.
The pseudo-code I posted probably should have explicitly stated that
Constant (and the other datatypes) implement Expression:
type Constant: v:int
Constant implements Expression
> And also, the (somewhat difficult to read) code in Wadler's paper
> seems to extend the interface by introducing a new interface name
> ("Lang2"):http://www.cs.purdue.edu/homes/jv/510s05/papers/wadler.txt
Indeed. I do not understand why.
>
> So I argue that step (T=2) in your example should introduce an
> extended interface with a new name.
What would be the purpose of that?
> But this is also where it gets a bit complicated. You can see that
> Wadler's solution is actually quite a mess, there's now two versions
> of everything:
> Lang.Exp vs Lang2.Exp
> Lang.Plus vs Lang2.Plus
> Lang.Eval vs Lang2.Eval
> etc..
>
> So I would say Go manages to solve it quite well:http://pastebin.com/GLhaRtQM
The stringify() function in your Go code is violating static type
safety.
The code also violates the requirement that you should not be editing
the stringify() function when you later add another function over the
datatypes (for example, if you later add function
'NumSubExpressions()' returning the number of sub-expressions of an
expression).
Existing code wouldn't break, because adding a method and not
implementing it for all datatypes would be rejected by the compiler.
And also, the (somewhat difficult to read) code in Wadler's paperseems to extend the interface by introducing a new interface name("Lang2"):http://www.cs.purdue.edu/homes/jv/510s05/papers/wadler.txt
Indeed. I do not understand why.
So I argue that step (T=2) in your example should introduce anextended interface with a new name.
What would be the purpose of that?
But this is also where it gets a bit complicated. You can see thatWadler's solution is actually quite a mess, there's now two versionsof everything:Lang.Exp vs Lang2.ExpLang.Plus vs Lang2.PlusLang.Eval vs Lang2.Evaletc..So I would say Go manages to solve it quite well:http://pastebin.com/GLhaRtQM
The stringify() function in your Go code is violating static type
safety.
If it is a fact that the new method could not logically be written at
all (for a particular datatype), then there is absolutely nothing you
or the compiler or the type system or the language can do about it.
Such methods do not qualify for EP.
> Your code becomes remarkably fragile. If you can ensure that the method can be implemented for all types implementing the interface, then you could just write it as a function over the interface and not bother with the method. Otherwise, you're abusing your abstraction.
I do not understand this.
>
> >> And also, the (somewhat difficult to read) code in Wadler's paper
> >> seems to extend the interface by introducing a new interface name
> >> ("Lang2"):http://www.cs.purdue.edu/homes/jv/510s05/papers/wadler.txt
>
> > Indeed. I do not understand why.
>
> This means you've probably misunderstood the problem. This is much more likely than trying to claim that Wadler misunderstood his own problem...
Maybe. But it is also possible that "Lang2" in
http://www.cs.purdue.edu/homes/jv/510s05/papers/wadler.txt is there
only "by accident".
>
> >> So I argue that step (T=2) in your example should introduce an
> >> extended interface with a new name.
>
> > What would be the purpose of that?
>
> So that you don't break existing code.
In the pseudo-code I posted, there seems to be no need to introduce a
new name of the interface 'Expression' (i.e: Expression2 or
something).
The pseudo-code doesn't break anything - as far as I know.
>
> >> But this is also where it gets a bit complicated. You can see that
> >> Wadler's solution is actually quite a mess, there's now two versions
> >> of everything:
> >> Lang.Exp vs Lang2.Exp
> >> Lang.Plus vs Lang2.Plus
> >> Lang.Eval vs Lang2.Eval
> >> etc..
>
> >> So I would say Go manages to solve it quite well:http://pastebin.com/GLhaRtQM
>
> > The stringify() function in your Go code is violating static type
> > safety.
>
> I wouldn't consider the stringify function as a necessary part of the solution anyways.
Why would the stringify() function be written there if it wasn't
necessary?
> On Nov 27, 7:56 pm, Steven Blenkinsop <steven...@gmail.com> wrote:
>> On Nov 27, 2011, at 1:27 PM, ⚛ <0xe2.0x9a.0...@gmail.com> wrote:
>>
>>> Existing code wouldn't break, because adding a method and not
>>> implementing it for all datatypes would be rejected by the compiler.
>>
>> Which means your code would be uncompilable as soon as there existed a type implementing the original interface for which the new method could not logically be written.
>
> If it is a fact that the new method could not logically be written at
> all (for a particular datatype), then there is absolutely nothing you
> or the compiler or the type system or the language can do about it.
Precisely. And yet, there is nothing stopping another library author from creating such a type in their library, making the two libraries irreconcilable, creating a mess of incompatible libraries and a fractured programming community. And all because of your interface imperialism ;).
> Such methods do not qualify for EP.
Exactly my point. Trying to add a method to all types implementing a given interface does not qualify for EP :).
>> Your code becomes remarkably fragile. If you can ensure that the method can be implemented for all types implementing the interface, then you could just write it as a function over the interface and not bother with the method. Otherwise, you're abusing your abstraction.
>
> I do not understand this.
The only way to ensure that all types implementing a given interface also support some other computation is if that computation can be defined purely on the interface. I'm not sure of how else to describe this, it's a bit of a truism. The only way to know if your abstraction supports an operation is if it supports the operation directly.
>>> Indeed. I do not understand why.
>>
>> This means you've probably misunderstood the problem. This is much more likely than trying to claim that Wadler misunderstood his own problem...
>
> Maybe. But it is also possible that "Lang2" in
> http://www.cs.purdue.edu/homes/jv/510s05/papers/wadler.txt is there
> only "by accident".
I'd say no, it's completely intentional. I wouldn't want to use any language that "solved" the problem in a way that satisfies your interpretation of it.
> In the pseudo-code I posted, there seems to be no need to introduce a
> new name of the interface 'Expression' (i.e: Expression2 or
> something).
>
> The pseudo-code doesn't break anything - as far as I know.
That's, again, the problem. "As far as you know". Unless you can claim to know all that code that uses the interface you are altering, then you can't know that your change doesn't break said code irreparably.
>> I wouldn't consider the stringify function as a necessary part of the solution anyways.
>
> Why would the stringify() function be written there if it wasn't
> necessary?
Because perhaps I disagree with the person who wrote it (at the time that they did) on its necessity. It comes back to, either you can write the function entirely in terms of the original interface, or else you can't ensure that all types implementing the original interface can logically support such a function, in which case the function can't be required to be possible by EP.
Although it is more "clean" to leave them out, I thought it would be
convenient to have a way of easily stringifying existing trees.
However this comes with a cost of added complexity and a bit of
fragility...
I agree, and that way is: "don't do that."
It is telling that the solution to this "problem" in GJ starts with:
class LangF<This extends LangF<This>> {
...
}
final class Lang extends LangF<Lang> {}
Is what you are trying to do really so complex that you need
a recursive definition of the inheritance hierarchy for your type?
Probably not.
I would argue that in the vast majority of cases, reaching for
this kind of pattern is overkill. It's cute that you can do it but
in practice is it really worth the complexity?
Russ