The expression problem in go

539 views
Skip to first unread message

Mikael Gustavsson

unread,
Nov 25, 2011, 4:58:44 AM11/25/11
to golang-nuts
Hi mailing list, this might be an interesting problem for you:

As recently discussed here: http://www.reddit.com/r/programming/comments/mmrmj/the_expression_problem/

The Expression Problem is a new name for an old problem. The goal is
to define a datatype by cases, where one can add new cases to the
datatype and new functions over the datatype, without recompiling
existing code, and while retaining static type safety (e.g., no casts)

I'm guessing this is how you would approach it using Go interfaces:

// shape package defines squares and circles with an area method
type Shape interface {
Area() float64
}

type Square struct {
Side float64
}

func (r *Square) Area() float64 {
return r.Side * r.Side
}

type Circle struct {
Radius float64
}

func (c *Circle) Area() float64 {
return math.Pi * c.Radius * c.Radius
}

// PShape package extends existing shapes with a perimeter method
type PShape interface {
Shape
Perimeter() float64
}

type PSquare Square

func (s *PSquare) Area() float64 {
return (*Square)(s).Area()
}

func (s *PSquare) Perimeter() float64 {
return s.Side * 4
}

type PCircle Circle

func (c *PCircle) Area() float64 {
return (*Circle)(c).Area()
}

func (c *PCircle) Perimeter() float64 {
return 2 * math.Pi * c.Radius
}

// Rect package adds a new shape that implements both shape and pshape
interfaces
type Rect struct {
W, H float64
}

func (r *Rect) Area() float64 {
return r.W * r.H
}

func (r *Rect) Perimeter() float64 {
return r.W*2 + r.H*2
}

// test package
func printArea(s Shape) {
fmt.Println("area ", s.Area())
}

func printPerimeter(s PShape) {
fmt.Println("perimeter ", s.Perimeter())
}

func printBoth(s PShape) {
printArea(s)
printPerimeter(s)
fmt.Println()
}

func main() {
s := &Square{2}
c := &Circle{1}
r := &Rect{2, 3}

printArea(s)
printArea(c)
printArea(r)

printPerimeter((*PSquare)(s))
printPerimeter((*PCircle)(c))
printPerimeter(r)

fmt.Println()
printBoth((*PSquare)(s))
printBoth((*PCircle)(c))
printBoth(r)
}

Note: I wrote this in one file, but the idea is that it's separated
into different packages and that you should be able to both add new
shapes and add new methods without changing the source code of the
original package. (Or even recompiling it)

While this solution is reasonably nice, there's two problems:
1. Extending the interface requires you to re-implement the existing
functions (Area() in this case)
2. The client needs to manually handle type casts to extended
interfaces

Is there a better way?

Steven Blenkinsop

unread,
Nov 25, 2011, 3:54:45 PM11/25/11
to Mikael Gustavsson, golang-nuts
On Fri, Nov 25, 2011 at 4:58 AM, Mikael Gustavsson <slv...@gmail.com> wrote:
Hi mailing list, this might be an interesting problem for you:

As recently discussed here: http://www.reddit.com/r/programming/comments/mmrmj/the_expression_problem/

The Expression Problem is a new name for an old problem. The goal is
to define a datatype by cases, where one can add new cases to the
datatype and new functions over the datatype, without recompiling
existing code, and while retaining static type safety (e.g., no casts)

I'm guessing this is how you would approach it using Go interfaces:

<snip>
 
While this solution is reasonably nice, there's two problems:
1. Extending the interface requires you to re-implement the existing
functions (Area() in this case)

type PSquare struct { *Square }

func (s PSquare) Perimeter() float64 {
       return s.Side * 4
}
 
OR

type PSquare Square

func (s *PSquare) Area() float64        { return (*Square)(s).Area() } 
func (s *PSquare) Perimeter() float64 { return s.Size * 4 }

2. The client needs to manually handle type casts to extended
interfaces
 
Honestly, it shouldn't come up much. When you use the types in a code path that is aware of the extended interface, presumably so are you, so you can make the choice up front while the type is statically known. Also, I'd consider it a good thing, since it means that the change doesn't impact old code paths in unexpected ways.

Mikael Gustavsson

unread,
Nov 26, 2011, 7:19:40 AM11/26/11
to golang-nuts
> type PSquare struct { *Square }
>

This allows us to avoid writing the boilerplate for the existing
methods.

However, now you can't typecast a *Square into a *PSquare anymore:

printBoth((*PSquare)(s)) // not allowed

So you have to create a new object:

printBoth(&PSquare{s}) // works

Something which is kinda strange/interesting is that it's actually
possible to modify the embedded "parent" pointer of an object:

ps := PSquare{s}
printBoth(&ps)
ps.Square := s2
printBoth(&ps)

unread,
Nov 26, 2011, 11:43:39 AM11/26/11
to golang-nuts
On Nov 25, 10:58 am, Mikael Gustavsson <slv...@gmail.com> wrote:
> Hi mailing list, this might be an interesting problem for you:
>
> As recently discussed here:http://www.reddit.com/r/programming/comments/mmrmj/the_expression_pro...

>
> The Expression Problem is a new name for an old problem. The goal is
> to define a datatype by cases, where one can add new cases to the
> datatype and new functions over the datatype, without recompiling
> existing code, and while retaining static type safety (e.g., no casts)

I do not understand why designing a special-case language and compiler
for the Expression Problem would be an issue. You "just" design such a
language and implement the compiler. No?

Steven Blenkinsop

unread,
Nov 26, 2011, 1:09:55 PM11/26/11
to Mikael Gustavsson, golang-nuts
On Nov 26, 2011, at 7:19 AM, Mikael Gustavsson <slv...@gmail.com> wrote:

>> type PSquare struct { *Square }
>>
>
> This allows us to avoid writing the boilerplate for the existing
> methods.
>
> However, now you can't typecast a *Square into a *PSquare anymore:

You don't need type casts; a composite literal and a selection achieve the same result:
ps := PSquare{s}
s2 := ps.Square

> So you have to create a new object:
>
> printBoth(&PSquare{s}) // works

Don't use a *PSquare; a PSquare is already a [struct containing a] pointer to a Square. Notice I wrote the method with a PSquare receiver. "Creating a new object" is an overstatement. All the compiler does is notice that the type has changed. The representation is identical.

> Something which is kinda strange/interesting is that it's actually
> possible to modify the embedded "parent" pointer of an object:
>

> ps.Square := s2

Replacing a struct and replacing its single field are the same operation. Thus, this is equivalent to:
ps = PSquare{s2}

which is
ps = (*PSquare)(s2)

in the other case. Not all that odd. Just don't think of the PSquare struct as a separate object. It only exists for the sake of type analysis.

Mikael Gustavsson

unread,
Nov 26, 2011, 1:46:49 PM11/26/11
to golang-nuts
> I do not understand why designing a special-case language and compiler> for the Expression Problem would be an issue. You "just" design such a> language and implement the compiler. No?
That might be so, yet very few if any languages does solve the
problem. (Retaining independent compilation and static typing.)
Go interfaces apparently does solve it and they seem more of a novel
idea than a complicated implementation.
On the other hand one could argue that Go doesn't solve the problem,
since it doesn't allow adding methods to the original
datatypes.Although AFAIK this is by design, and not because it
couldn't be implemented.

Mikael Gustavsson

unread,
Nov 26, 2011, 2:02:06 PM11/26/11
to golang-nuts
> Don't use a *PSquare; a PSquare is already a [struct containing a] pointer to a Square. Notice I wrote the method with a PSquare receiver. "Creating a new object" is an overstatement. All the compiler does is notice that the type has changed. The representation is identical.

Oh, I missed that you left the * out. Yeah, this version is probably
the best one.

unread,
Nov 27, 2011, 5:47:08 AM11/27/11
to golang-nuts

In my opinion, Go does not have any kind of "native support" for the
Expression Problem (EP).

I am not an expert on EP. The following pseudo-code demonstrates my
current understanding of EP (--- separates compilation units (files),
T is time):

--- (T=0)

// 'Expression' is an interface with an 'Eval()' method
Expression: add method Eval() -> int

--- (T=1)

// 'Constant' implements 'Expression@T=0'
type Constant: v:int
Constant.Eval() = v

// 'Addition' implements 'Expression@T=0'
type Addition: a:Expression b:Expression
Addition.Eval() = a.Eval()+b.Eval()

--- (T=2)

// Add new method 'String()' to interface 'Expression'
Expression: add method String() -> string

// The compiler demands 'Constant' and 'Addition'
// to implement the newly added method.
// Both will implement 'Expression@T=2'.
Constant.String() = itoa(v)
Addition.String() = a.String() ++ "+" ++ b.String()

--- (T=3)

// New datatype 'Subtraction' implements 'Expression@T=2'
type Subtraction: a:Expression b:Expression
Subtraction.Eval() = a.Eval()-b.Eval()
Subtraction.String() = a.String() ++ "-" ++ b.String()

--- (end of pseudo-code)

This is only a small example, so it is possible that expanding the
code and using EP on a regular basis would reveal some issues I do not
see yet, such as type system issues.

Mikael Gustavsson

unread,
Nov 27, 2011, 12:01:35 PM11/27/11
to golang-nuts
I take it that your interpretation is that it should be possible to
extend an interface "in place":

> Expression: add method String() -> string

And that the compiler would then check that all existing types that
implement Expression now implements the String() method.
It would need to do this because otherwise lines like this would not
type check:


> Subtraction.String() = a.String() ++ "-" ++ b.String()

I don't think this is the right interpretation, because that would be
very fragile.
As soon as anyone decided to "monkey patch" in an additional method,
all existing code would break.
And also, the (somewhat difficult to read) code in Wadler's paper
seems to extend the interface by introducing a new interface name
("Lang2"):
http://www.cs.purdue.edu/homes/jv/510s05/papers/wadler.txt

So I argue that step (T=2) in your example should introduce an
extended interface with a new name.
But this is also where it gets a bit complicated. You can see that
Wadler's solution is actually quite a mess, there's now two versions
of everything:
Lang.Exp vs Lang2.Exp
Lang.Plus vs Lang2.Plus
Lang.Eval vs Lang2.Eval
etc..

So I would say Go manages to solve it quite well:
http://pastebin.com/GLhaRtQM

unread,
Nov 27, 2011, 1:27:49 PM11/27/11
to golang-nuts
On Nov 27, 6:01 pm, Mikael Gustavsson <slv...@gmail.com> wrote:
> I take it that your interpretation is that it should be possible to
> extend an interface "in place":

Yes.

>
> > Expression: add method String() -> string
>
> And that the compiler would then check that all existing types that
> implement Expression now implements the String() method.

Yes.

> It would need to do this because otherwise lines like this would not
> type check:
>
> > Subtraction.String() = a.String() ++ "-" ++ b.String()
>
> I don't think this is the right interpretation, because that would be
> very fragile.
> As soon as anyone decided to "monkey patch" in an additional method,
> all existing code would break.

Existing code wouldn't break, because adding a method and not
implementing it for all datatypes would be rejected by the compiler.

The pseudo-code I posted probably should have explicitly stated that
Constant (and the other datatypes) implement Expression:

type Constant: v:int
Constant implements Expression

> And also, the (somewhat difficult to read) code in Wadler's paper
> seems to extend the interface by introducing a new interface name
> ("Lang2"):http://www.cs.purdue.edu/homes/jv/510s05/papers/wadler.txt

Indeed. I do not understand why.

>
> So I argue that step (T=2) in your example should introduce an
> extended interface with a new name.

What would be the purpose of that?

> But this is also where it gets a bit complicated. You can see that
> Wadler's solution is actually quite a mess, there's now two versions
> of everything:
> Lang.Exp vs Lang2.Exp
> Lang.Plus vs Lang2.Plus
> Lang.Eval vs Lang2.Eval
> etc..
>
> So I would say Go manages to solve it quite well:http://pastebin.com/GLhaRtQM

The stringify() function in your Go code is violating static type
safety.

The code also violates the requirement that you should not be editing
the stringify() function when you later add another function over the
datatypes (for example, if you later add function
'NumSubExpressions()' returning the number of sub-expressions of an
expression).

Steven Blenkinsop

unread,
Nov 27, 2011, 1:56:14 PM11/27/11
to ⚛, golang-nuts
On Nov 27, 2011, at 1:27 PM, ⚛ <0xe2.0x...@gmail.com> wrote:

Existing code wouldn't break, because adding a method and not
implementing it for all datatypes would be rejected by the compiler.

Which means your code would  be uncompilable as soon as there existed a type implementing the original interface for which the new method could not logically be written. Your code becomes remarkably fragile. If you can ensure that the method can be implemented for all types implementing the interface, then you could just write it as a function over the interface and not bother with the method. Otherwise, you're abusing your abstraction.

And also, the (somewhat difficult to read) code in Wadler's paper
seems to extend the interface by introducing a new interface name
("Lang2"):http://www.cs.purdue.edu/homes/jv/510s05/papers/wadler.txt

Indeed. I do not understand why.

This means you've probably misunderstood the problem. This is much more likely than trying to claim that Wadler misunderstood his own problem...

So I argue that step (T=2) in your example should introduce an
extended interface with a new name.

What would be the purpose of that?

So that you don't break existing code.

But this is also where it gets a bit complicated. You can see that
Wadler's solution is actually quite a mess, there's now two versions
of everything:
Lang.Exp vs Lang2.Exp
Lang.Plus vs Lang2.Plus
Lang.Eval vs Lang2.Eval
etc..

So I would say Go manages to solve it quite well:http://pastebin.com/GLhaRtQM

The stringify() function in your Go code is violating static type
safety.

I wouldn't consider the stringify function as a necessary part of the solution anyways. If the code knows its type can be upgraded in this way, it also knows how. If it doesn't, then it is incorrect regardless.

unread,
Nov 27, 2011, 2:19:59 PM11/27/11
to golang-nuts
On Nov 27, 7:56 pm, Steven Blenkinsop <steven...@gmail.com> wrote:

> On Nov 27, 2011, at 1:27 PM, ⚛ <0xe2.0x9a.0...@gmail.com> wrote:
>
> > Existing code wouldn't break, because adding a method and not
> > implementing it for all datatypes would be rejected by the compiler.
>
> Which means your code would  be uncompilable as soon as there existed a type implementing the original interface for which the new method could not logically be written.

If it is a fact that the new method could not logically be written at
all (for a particular datatype), then there is absolutely nothing you
or the compiler or the type system or the language can do about it.

Such methods do not qualify for EP.

> Your code becomes remarkably fragile. If you can ensure that the method can be implemented for all types implementing the interface, then you could just write it as a function over the interface and not bother with the method. Otherwise, you're abusing your abstraction.

I do not understand this.

>
> >> And also, the (somewhat difficult to read) code in Wadler's paper
> >> seems to extend the interface by introducing a new interface name
> >> ("Lang2"):http://www.cs.purdue.edu/homes/jv/510s05/papers/wadler.txt
>
> > Indeed. I do not understand why.
>
> This means you've probably misunderstood the problem. This is much more likely than trying to claim that Wadler misunderstood his own problem...

Maybe. But it is also possible that "Lang2" in
http://www.cs.purdue.edu/homes/jv/510s05/papers/wadler.txt is there
only "by accident".

>
> >> So I argue that step (T=2) in your example should introduce an
> >> extended interface with a new name.
>
> > What would be the purpose of that?
>
> So that you don't break existing code.

In the pseudo-code I posted, there seems to be no need to introduce a
new name of the interface 'Expression' (i.e: Expression2 or
something).

The pseudo-code doesn't break anything - as far as I know.

>
> >> But this is also where it gets a bit complicated. You can see that
> >> Wadler's solution is actually quite a mess, there's now two versions
> >> of everything:
> >> Lang.Exp vs Lang2.Exp
> >> Lang.Plus vs Lang2.Plus
> >> Lang.Eval vs Lang2.Eval
> >> etc..
>
> >> So I would say Go manages to solve it quite well:http://pastebin.com/GLhaRtQM
>
> > The stringify() function in your Go code is violating static type
> > safety.
>
> I wouldn't consider the stringify function as a necessary part of the solution anyways.

Why would the stringify() function be written there if it wasn't
necessary?

Steven Blenkinsop

unread,
Nov 27, 2011, 2:49:00 PM11/27/11
to ⚛, golang-nuts
On Nov 27, 2011, at 2:19 PM, ⚛ <0xe2.0x...@gmail.com> wrote:

> On Nov 27, 7:56 pm, Steven Blenkinsop <steven...@gmail.com> wrote:
>> On Nov 27, 2011, at 1:27 PM, ⚛ <0xe2.0x9a.0...@gmail.com> wrote:
>>
>>> Existing code wouldn't break, because adding a method and not
>>> implementing it for all datatypes would be rejected by the compiler.
>>
>> Which means your code would be uncompilable as soon as there existed a type implementing the original interface for which the new method could not logically be written.
>
> If it is a fact that the new method could not logically be written at
> all (for a particular datatype), then there is absolutely nothing you
> or the compiler or the type system or the language can do about it.

Precisely. And yet, there is nothing stopping another library author from creating such a type in their library, making the two libraries irreconcilable, creating a mess of incompatible libraries and a fractured programming community. And all because of your interface imperialism ;).

> Such methods do not qualify for EP.

Exactly my point. Trying to add a method to all types implementing a given interface does not qualify for EP :).

>> Your code becomes remarkably fragile. If you can ensure that the method can be implemented for all types implementing the interface, then you could just write it as a function over the interface and not bother with the method. Otherwise, you're abusing your abstraction.
>
> I do not understand this.

The only way to ensure that all types implementing a given interface also support some other computation is if that computation can be defined purely on the interface. I'm not sure of how else to describe this, it's a bit of a truism. The only way to know if your abstraction supports an operation is if it supports the operation directly.

>>> Indeed. I do not understand why.
>>
>> This means you've probably misunderstood the problem. This is much more likely than trying to claim that Wadler misunderstood his own problem...
>
> Maybe. But it is also possible that "Lang2" in
> http://www.cs.purdue.edu/homes/jv/510s05/papers/wadler.txt is there
> only "by accident".

I'd say no, it's completely intentional. I wouldn't want to use any language that "solved" the problem in a way that satisfies your interpretation of it.

> In the pseudo-code I posted, there seems to be no need to introduce a
> new name of the interface 'Expression' (i.e: Expression2 or
> something).
>
> The pseudo-code doesn't break anything - as far as I know.

That's, again, the problem. "As far as you know". Unless you can claim to know all that code that uses the interface you are altering, then you can't know that your change doesn't break said code irreparably.

>> I wouldn't consider the stringify function as a necessary part of the solution anyways.
>
> Why would the stringify() function be written there if it wasn't
> necessary?

Because perhaps I disagree with the person who wrote it (at the time that they did) on its necessity. It comes back to, either you can write the function entirely in terms of the original interface, or else you can't ensure that all types implementing the original interface can logically support such a function, in which case the function can't be required to be possible by EP.

Mikael Gustavsson

unread,
Nov 27, 2011, 10:29:11 PM11/27/11
to golang-nuts
The stringify function and the StringedExp type from the code can be
removed and then the solution will be more like Wadler's.
That is, if you want to be able to string an expression you need to
build it out of the new extended types which are stringable.

Although it is more "clean" to leave them out, I thought it would be
convenient to have a way of easily stringifying existing trees.
However this comes with a cost of added complexity and a bit of
fragility...

Kyle Lemons

unread,
Nov 28, 2011, 12:59:19 AM11/28/11
to golang-nuts
In my mind, the Expression Problem is addressed in Go in a way that probably wouldn't make my Computer Science professors happy but which works out great in day-to-day programming.

When you define an entity (a type in Go), you establish the data it contains.  When you need a behavior (an interface in Go), you describe the actions (methods in Go) it requires.  Entities can implement the behaviors that make sense by providing the appropriate actions.  This is all pretty normal, and is about what you have in any normal OO design, and isn't incredibly far from the way it works in FP either.  In both, you end up with geometric proliferation of modifications whenever you try to change something.  Adding a new behavior means that you have to define it for all entities, and adding a new entity means that you have to define all of its actions.

In Go, by virtue of the orthogonality of interfaces, you can avoid some of this by defining very small, focused interfaces, and in some cases by making them "optional" and checking for them as you go (like Hijacker).  It may be that the new behavior you want will have to be present in all possible entities, but in the many cases where it isn't (say you're trying to Render lots of things, only some of which can or should be rendered Offscreen) the code winds up much cleaner, and the changes you have to make are focused to where the behavior or functionality is actually changing.  The closest that most other languages come to this are mixins or multiple inheritance, both of which come with their own set of drawbacks.

Also, from looking at the various "solutions" to the Expression Problem posed in the original article, it really makes me appreciate the simplicity and explicitness of the Go type/interface system...

Russ Cox

unread,
Nov 28, 2011, 11:32:49 AM11/28/11
to Kyle Lemons, golang-nuts
> In my mind, the Expression Problem is addressed in Go in a way that
> probably wouldn't make my Computer Science professors happy but
> which works out great in day-to-day programming.

I agree, and that way is: "don't do that."

It is telling that the solution to this "problem" in GJ starts with:

class LangF<This extends LangF<This>> {
...
}
final class Lang extends LangF<Lang> {}

Is what you are trying to do really so complex that you need
a recursive definition of the inheritance hierarchy for your type?
Probably not.

I would argue that in the vast majority of cases, reaching for
this kind of pattern is overkill. It's cute that you can do it but
in practice is it really worth the complexity?

Russ

Reply all
Reply to author
Forward
0 new messages