F#


Jon Harrop

Nov 1, 2007, 11:01:37 PM
to jvm-la...@googlegroups.com

Could a CAML derivative language for the JVM gain the same traction that F# is
gaining on .NET?

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e

Matt Hellige

Nov 1, 2007, 11:49:57 PM
to JVM Languages
On Nov 1, 10:01 pm, Jon Harrop <j...@ffconsultancy.com> wrote:
> Could a CAML derivative language for the JVM gain the same traction that F# is
> gaining on .NET?

Perhaps if someone invested the same amount of money in it... (I guess
that's a bit cynical, but only a bit.)

Matt

John Cowan

Nov 2, 2007, 12:11:30 AM
to jvm-la...@googlegroups.com
F# was a research project. I doubt there would be enough research-fu
in just doing it again for the JVM, which is after all very similar.


--
GMail doesn't have rotating .sigs, but you can see mine at
http://www.ccil.org/~cowan/signatures

Jon Harrop

Nov 2, 2007, 1:15:14 AM
to jvm-la...@googlegroups.com
On Friday 02 November 2007 04:11, John Cowan wrote:
> F# was a research project. I doubt there would be enough research-fu
> in just doing it again for the JVM, which is after all very similar.

Yes. I think there would be great merit in forgetting about the research side
of things and just writing an ML that targets the JVM. There have been some
attempts at writing compiler backends for languages like SML and OCaml but
nobody has tried writing a decent development environment for them (e.g. with
integrated top-levels).

John Cowan

Nov 2, 2007, 2:04:21 AM
to jvm-la...@googlegroups.com
In that case, Ocaml-Java is what you want, at
http://ocamljava.x9c.fr/ .

David Pollak

Nov 2, 2007, 9:26:50 AM
to jvm-la...@googlegroups.com
One could always use Scala... :-)

I have not used F#, but my understanding is that the F# team and the Scala team are personal friends and research collaborators and will often integrate the best of each language into the other.

--
lift, the secure, simple, powerful web framework
http://liftweb.net

Jon Harrop

Nov 2, 2007, 11:51:12 AM
to jvm-la...@googlegroups.com
On Friday 02 November 2007 13:26, David Pollak wrote:
> One could always use Scala... :-)
>
> I have not used F#, but my understanding is that the F# team and the Scala
> team are personal friends and research collaborators and will often
> integrate the best of each language into the other.

There are various aspects of languages like OCaml that I consider to be of
critical importance. Pattern matching, type inference and static checking are
among those.

F# does quite well in reimplementing them with the exception that its
statically checked pattern matching gives poor error messages (simply stating
when a pattern is incomplete but not describing how by giving examples, as
OCaml does).

However, Scala is far behind in many of these respects and also incorporates a
lot of needless verbosity from Java. For example, Scala only recently
acquired the ability to perform the most rudimentary forms of exhaustiveness
checking. The Eclipse plugin for Scala is behind F#'s (rather lame) Visual
Studio plug-in which is, in turn, far behind the capabilities of OCaml and
Emacs.

These differences put OCaml a league ahead as far as I'm concerned. Thanks to
Microsoft, .NET will catch up and I think it would be great if the JVM could
catch up as well. For this to happen, someone must take the results of
previous research on ML and CAML and apply it to the creation of a new
language and development environment that targets the JVM.

Neil Bartlett

Nov 2, 2007, 1:10:02 PM
to jvm-la...@googlegroups.com
Jon,

I think that John Cowan hit the nail on the head. Researchers would
not be interested in -- or be able to get funding for -- a
straightforward "F# for JVM" because it's not likely to yield any
particularly interesting new results that have not already been
discovered by Don Syme's group at MSR Cambridge.

Researchers need to find novel fields to explore, which is why Martin
Odersky and his colleagues at EPFL are able to get funding to work on
Scala. It is precisely because Scala is different from F# that they
are able to work on it.

This leaves two other options. A commercial organisation could do it,
but they would need to be persuaded that it could make money. For
example Sun might be prepared to do it in the interests of enriching
the Java ecosystem... however they seem to be betting on Ruby rather
than OCaml at the moment. Another possibility would be a tools vendor
like Borland (or CodeGear), but it would only work for them if they
could build an entire suite of tools around the new language and
demonstrate far higher developer productivity versus Java.

I do not think that the above is likely to happen, so the third option
is that the open source community does it. Probably it would require a
sufficiently committed individual developer to write the core as a
proof of concept, which will draw in other developers. Who better than
Jon Harrop, one of the most outspoken proponents of OCaml in the
English-speaking world? ;-)

Good luck Jon! Regards,
Neil

Jon Harrop

Nov 2, 2007, 1:13:08 PM
to jvm-la...@googlegroups.com
On Friday 02 November 2007 17:10, Neil Bartlett wrote:
> This leaves two other options. A commercial organisation could do it,
> but they would need to be persuaded that they it could make money. For
> example Sun might be prepared to do it in the interests of enriching
> the Java ecosystem... however they seem to be betting on Ruby rather
> than OCaml at the moment. Another possibility would be a tools vendor
> like Borland (or CodeGear), but it would only work for them if they
> could build an entire suite of tools around the new language and
> demonstrate far higher developer productivity versus Java.

I can see why Sun would bet on Ruby rather than OCaml: web programming
is a rich Java market and Ruby plays to that strength. OCaml is much less
interesting for web programming and much more interesting for technical users
like scientists. I believe great things could be done with an ML targeting
the JVM.

> I do not think that the above is likely to happen, so the third option
> is that the open source community does it. Probably it would require a
> sufficiently committed individual developer to write the core as a
> proof of concept, which will draw in other developers. Who better than
> Jon Harrop, one of the most outspoken proponents of OCaml in the
> English-speaking world? ;-)

I was afraid you would say that. ;-)

Actually, if I do this then I'm not sure if it would be an open-source project
or a commercial product. Perhaps an IDE to provide a development environment
for the ocamljava project would be best. I haven't managed to get that
working yet but it looks to be by far the most interesting of the ones I've
seen.

I'll keep my eyes peeled. Thanks!

Steve Lianoglou

Nov 3, 2007, 10:36:19 AM
to JVM Languages
Hi Jon,

> OCaml is much less
> interesting for web programming and much more interesting for technical users
> like scientists. I believe great things could be done with an ML targeting
> the JVM.

I believe you're actively looking at/working with scala as well, so
I'm just wondering what (in your opinion) an OCaml-on-the-jvm
implementation would add that can't currently be addressed by Scala
(except for being able to run real/already-written Ocaml stuff on the
JVM).

In other words, what's scala missing that has you pining for OCaml?

Just curious (from a person who hasn't done much work in either),

-steve

Jim White

Nov 3, 2007, 10:58:19 PM
to jvm-la...@googlegroups.com
Matt Hellige wrote:

It's not all that cynical. F# *was* ML for the JVM before the team
(Persimmon IT) was bought by M$.

http://www.dcs.ed.ac.uk/home/mlj/

Since MLj is GPL, it would be great if some smart folks would dust it off!

Jim

Jon Harrop

Nov 4, 2007, 12:27:04 AM
to jvm-la...@googlegroups.com

OCaml and F# have shown that ML's approach to structured programming, using
modules, variant types, pattern matching and extensive type inference, is
almost always preferable to OOP. When given the choice between OOP and FP,
seasoned programmers rarely choose OOP.

However, Scala's design ignores many of the lessons learned over the past 35
years from the ML family of languages and, instead, panders to OOP and tries
to integrate OOP with pattern matching. While this might be of academic
interest, it is of little practical relevance because the resulting language
fails to capture the benefits of ML (let alone the benefits added by OCaml
and F#).

OCaml programmers are typically over 10x as productive as Java or C++
programmers for a wide range of practical tasks. Despite being based upon a
fundamentally OOP platform, F# goes a long way to capturing the productivity-
boosting benefits of OCaml (and the whole ML family). In contrast, Scala
fails to capture many of the benefits including some really basic ones and,
consequently, writing correct code is much more difficult in Scala than in
any real ML.

Moreover, the ML family of languages are designed to be succinct but Scala is
needlessly verbose for everything from "Hello world!" upwards. The ML family
of languages provide extensive type inference (OCaml more than most) but
Scala has only rudimentary inference by comparison. OCaml has an unusually
expressive type system but Scala adds little to OOP that is of practical
importance.

For example, here is "Hello world!" in F#:

printf "Hello world!"

and in Scala:

object HelloWorld {
  def main(args: Array[String]) {
    println("Hello world!")
  }
}

Here is a simple 5-line derivative function in OCaml:

# let rec d f x = match f with
| `Var y when x=y -> `Int 1
| `Int _ | `Var _ -> `Int 0
| `Add(f, g) -> `Add(d f x, d g x)
| `Mul(f, g) -> `Add(`Mul(f, d g x), `Mul(g, d f x));;
val d :
([< `Add of 'a * 'a | `Int of 'b | `Mul of 'a * 'a | `Var of 'c ] as 'a) ->
'c -> ([> `Add of 'd * 'd | `Int of int | `Mul of 'a * 'd ] as 'd) = <fun>

OCaml inferred the sum type and checks the pattern match for exhaustiveness
and redundancy. Type inference has even proven that the `Var construct will
never survive an application of this derivative function "d".
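To make that claim concrete, here is a minimal runnable sketch (the definition of `d` is repeated from above so the snippet is self-contained); applying it shows that no fresh `Var ever appears in a result:

```ocaml
(* Jon's derivative function, repeated verbatim for a self-contained check. *)
let rec d f x = match f with
  | `Var y when x = y -> `Int 1
  | `Int _ | `Var _ -> `Int 0
  | `Add (f, g) -> `Add (d f x, d g x)
  | `Mul (f, g) -> `Add (`Mul (f, d g x), `Mul (g, d f x))

(* d/dx x = 1, and d/dx (x*x) = x*1 + x*1: every `Var in the output is a
   subterm carried over from the input; the function never builds one. *)
let () =
  assert (d (`Var "x") "x" = `Int 1);
  assert (d (`Mul (`Var "x", `Var "x")) "x"
          = `Add (`Mul (`Var "x", `Int 1), `Mul (`Var "x", `Int 1)))
```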

Scala cannot infer the sum type used to represent a symbolic expression so you
must start by declaring it explicitly:

abstract class Expr
case class Lit(f: Int) extends Expr
case class Var[T](x: T) extends Expr
case class Add(f: Expr, g: Expr) extends Expr
case class Mul(f: Expr, g: Expr) extends Expr

Note that the declaration of the type is as big as the entire OCaml solution.
This declaration contains numerous superfluous keywords such
as "abstract", "case", "class", "extends" and the name of the ABC "Expr".

In Scala, the derivative function becomes another 6 lines of code:

def d[T](f: Expr)(x: T): Expr = f match {
  case Var(y) => if (x == y) Lit(1) else Lit(0)
  case Lit(_) => Lit(0)
  case Add(f, g) => Add(d(f)(x), d(g)(x))
  case Mul(f, g) => Add(Mul(f, d(g)(x)), Mul(g, d(f)(x)))
}

The types of the arguments must be explicitly declared. You can sometimes omit
the return type but in this case Scala is unable to infer it so you must also
declare that explicitly.

Static checking for exhaustiveness and redundancy of such pattern matches is a
critical feature in all other modern FPLs. OCaml is particularly clever in
this respect:

# function
| 0 -> true
| 1 -> false;;
Warning P: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
2
- : int -> bool = <fun>

Note how OCaml not only warns you that the pattern match is not exhaustive but
even gives you an example of a value that is not handled.

This static checking becomes invaluable as more sophisticated pattern matches
are used. Consider:

# let foo = function
| true, _ -> 0
| _, false -> 1;;
Warning P: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
(false, true)
val foo : bool * bool -> int = <fun>

OCaml correctly identifies the only value that is not handled. In contrast,
Scala provides no such assurance, no warning, and just silently compiles this
erroneous code:

scala> def foo(b: (Boolean, Boolean)): Int = b match {
| case (true, _) => 0
| case (_, false) => 0
| }
foo: ((Boolean, Boolean))Int

scala> foo(false, true)
scala.MatchError: (false,true)
at .foo(<console>:4)
at .<init>(<console>:5)
at .<clinit>(<console>)
at java.lang.Class.initializeClass(libgcj.so.81)
at RequestResult$.<init>(<console>:3)
at RequestResult$.<clinit>(<console>)
at java.lang.Class.initializeClass(libgcj.so.81)
at RequestResult$result(<console>)
at java.lang.reflect.Method.invoke(libgcj.so.81)
...
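Note that the OCaml warning is directly actionable: add the one case it reports and the match becomes total. A minimal sketch:

```ocaml
(* Covering the reported value (false, true) silences the warning and
   removes any possibility of a runtime match failure. *)
let foo = function
  | true, _ -> 0
  | _, false -> 1
  | false, true -> 2

let () = assert (foo (false, true) = 2)
```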

With the latest version of Scala you can add "sealed" to your abstract class
to declare that it is a closed sum type (like an ordinary ML variant):

object E {
  sealed abstract class Expr
  case class Lit(f: Int) extends Expr
  case class Var[T](x: T) extends Expr
  case class Add(f: Expr, g: Expr) extends Expr
  case class Mul(f: Expr, g: Expr) extends Expr
}

in which case the Scala compiler will check such patterns for exhaustiveness
and redundancy, but this is only available for sum types, not for all types as
in the MLs.
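For comparison, an ordinary (non-polymorphic) OCaml variant is closed by declaration, the analogue of a sealed Scala hierarchy, and every match over it is checked. A minimal sketch:

```ocaml
(* A closed sum type: omitting either constructor in the match below
   would trigger "this pattern-matching is not exhaustive" at compile
   time, much as sealed enables in Scala. *)
type expr =
  | Lit of int
  | Add of expr * expr

let rec eval = function
  | Lit n -> n
  | Add (f, g) -> eval f + eval g

let () = assert (eval (Add (Lit 1, Lit 2)) = 3)
```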

So, while Scala is unquestionably an improvement over Java, I do not consider
it to be a front-runner among modern functional programming languages.

davewebb

Nov 4, 2007, 1:44:48 PM
to JVM Languages

>
> So, while Scala is unquestionably an improvement over Java, I do not consider
> it to be a front-runner among modern functional programming languages.
>

Just to put this in context, I don't think that Scala claims to be a
'modern functional programming language' as such. It's a multi-
paradigm language which "smoothly integrates features of object-
oriented and functional languages". As far as I can see, Scala is
paying homage to OCaml, Haskell, etc., by injecting functional
programming capability into a language that integrates well with Java
(and Java programmers), rather than competing with them.

Regards,
Dave Webb

Steve Lianoglou

Nov 4, 2007, 9:34:44 PM
to JVM Languages
Hi Jon,

While it's going to take me a while to make heads or tails of your
entire post, I just wanted to thank you for taking the time to write
such a detailed response.

Thanks!

-steve

Jon Harrop

Nov 4, 2007, 10:27:11 PM
to jvm-la...@googlegroups.com

Not at all. I described the major points but there are still many other
aspects of Scala that I am undecided on. Scala, OCaml and F# all take
different approaches to the rich syntax of user-defined operators,
particularly infix operators. The Scala approach of using member functions
called "+" is quite alien to me and it involves some trade-offs. After almost
a year of working full-time on F#, I am still undecided on whether or not
mixing C#-style operator overloading and ML-style type inference in F# is a
good thing. In the end, there may be more aspects of Scala that I consider to
be sub-optimal design decisions.

There are also practical problems when migrating from a stand-alone FPL like
OCaml to one hosted by an existing platform like .NET or the JVM. Stand-alone
FPLs like OCaml and MLton are fantastically efficient but .NET and the JVM
are optimized for heavily-imperative code. Consequently, allocation often
makes functional F# programs run 5x slower than equivalent OCaml, exceptions
are 600x slower in F# than in OCaml (and they are a core part of the CAML
family of languages, used extensively in ordinary non-exceptional code:
OCaml's exceptions are very fast, around 6x faster than C++!) and marshalling
in F# has been 1,000x slower for me.
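The exception figures matter because OCaml code routinely uses exceptions for ordinary control flow, not just errors. A sketch of the idiom (the function name is illustrative, not a standard library call):

```ocaml
(* Early exit from an iteration via an exception: a common OCaml idiom
   that relies on raising being cheap. *)
exception Found of int

let find_first p xs =
  try
    List.iter (fun x -> if p x then raise (Found x)) xs;
    None
  with Found x -> Some x

let () =
  assert (find_first (fun x -> x > 2) [1; 2; 3; 4] = Some 3);
  assert (find_first (fun x -> x > 9) [1; 2] = None)
```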

I honestly thought that I could grok F# from my OCaml background in only a few
months, but a year later I am still battling with all of these trade-offs and
design decisions, and I have barely dented the issue of libraries, active web
pages, Windows Presentation Foundation and so forth.

Following a discussion on Lambda the Ultimate, I am working on an article
comparing and contrasting the OCaml and F# languages in detail. I'll let you
know when I make the article freely available on our site. I think many of
the points will be very relevant for people trying to build modern languages
for the JVM.

There is one thing I am sure of: the future is bright for functional
programming!

davewebb

Nov 5, 2007, 11:28:28 AM
to JVM Languages

> In the end, there may be more aspects of Scala that I consider to
> be sub-optimal design decisions.
>

Jon,

By sub-optimal design decisions, do you mean sub-optimal in the
context of what Scala is actually trying to be, or sub-optimal in the
context of what you'd like Scala to be? I can understand that someone
coming to Scala from an OCaml background might be missing some of the
aspects of OCaml and choose to stay clear of Scala on that basis (and
it's a good education for me to see where OCaml can express things
more elegantly - thanks). But the Scala team aren't trying to create a
new OCaml. They're trying to create a language that integrates well
into the Java ecosystem with a lot of functional features. The phrase
'sub-optimal' has got to be qualified with a context.

Regards,
Dave Webb

hlovatt

Nov 7, 2007, 12:21:38 AM
to JVM Languages
In functional languages (I am most familiar with Haskell) you tend to
take a closed-world view of classes. Taking Jon's example:

> # let foo = function
> | true, _ -> 0
> | _, false -> 1;;

Here there are two classes derived from Bool, true and false, but if
you wanted to now add undefined, as often needed in database
applications for example, then you would not only have to change Bool
but also every function that uses Bool. Therefore you would need the
source for all libraries, in case they used Bool. This tends to be
limiting in larger applications and makes testing difficult. Imagine
in Java, if for every interface there was a fixed set of classes
derived from that interface and the user of a library could not derive
from an interface defined in the library! E.g. you couldn't write your
own Runnable; you could only use pre-defined ones.
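For what it's worth, OCaml's polymorphic variants (as used in Jon's examples) soften this closed-world objection: a function over a fixed set of tags can be reused inside one that also handles an extra tag, without touching the original. A sketch with illustrative names:

```ocaml
(* A two-valued function over `True/`False... *)
let to_int = function
  | `True -> 1
  | `False -> 0

(* ...reused by a three-valued one that also handles `Undefined,
   without modifying to_int or its inferred type. *)
let to_int3 = function
  | `Undefined -> -1
  | (`True | `False) as b -> to_int b

let () =
  assert (to_int3 `Undefined = -1);
  assert (to_int3 `True = 1)
```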

The Scala solution of allowing derived classes therefore has some
merit, but it has the drawback of poor error messages. As an
alternative I have implemented a compiler, called Pattern Enforcing
Compiler (PEC), that does multiple dispatch (plus other patterns):

http://pec.dev.java.net/nonav/compile/index.html

The compiler has no new syntax, instead it uses marker interfaces. The
example in PEC is:

public interface Bool extends MultipleDispatch {
  int foo( Bool other ); // method has multiple dispatch semantics
}

public abstract class AbstractBool implements Bool {
  // method body replaced by compiler with dispatcher
  public final int foo( Bool other ) { throw new AssertionError(); }
}

public class True extends AbstractBool {
  // could be in *any* class
  public static final int foo( final True t, final Bool b ) { return 0; }
}

public class False extends AbstractBool {
  // could be in *any* class
  public static final int foo( final Bool b, final False f ) { return 0; }
}

You get no error message at compile time because it is legitimate to
compile parts separately, the missing method could be written by the
user of the library for example. When Bool etc. are loaded (and you
haven't provided foo( True, False)) you get the following error:

pec.compile.multipledispatch.MultipleDispatchException: Cannot resolve
which method to call. Either
fsharpexample.False.foo( fsharpexample.Bool, fsharpexample.False )
{depth = 2} or fsharpexample.True.foo( fsharpexample.True,
fsharpexample.Bool ){depth = 2} match when arguments are [class
fsharpexample.True, class fsharpexample.False]. Add a method with
these arguments to resolve the conflict.

As you can see this error is similar to ML, including supplying an
example.

Therefore multiple dispatch provides the best of both worlds, ML and
Scala, and is already implemented by PEC on the JVM. The
implementations of ML and Scala on the JVM could be changed to work
like PEC. Further, the proposed new bytecode instruction, invokedynamic,
might make
writing the dynamic dispatcher easier (I wait with anticipation for
the details of invokedynamic!).

Neil Bartlett

Nov 7, 2007, 3:47:59 AM
to jvm-la...@googlegroups.com
Jon,

It looks like your plea may have been answered. The following popped
up on reddit this morning:

"Cafesterol is an extension of the Objective Caml compiler suite that
generates Java bytecode. Cafesterol provides an ocamljava compiler
that is the Java counterpart of ocamlc/ocamlopt compilers distributed
with the Objective Caml standard distribution. Cafesterol, in its 1.0
alpha version builds with the 3.10.0 version of Objective Caml. The
produced Java classes need the 1.0 alpha version of Cadmium to run and
can be executed on any Java 1.5 virtual machine."

http://cafesterol.x9c.fr/

Regards,
Neil

Patrick Wright

Nov 7, 2007, 4:15:01 AM
to jvm-la...@googlegroups.com
> http://cafesterol.x9c.fr/

Great name. Made my day. Is that good or bad cafesterol?

The PDF for Cafesterol includes a section at the end, "Compatibility
with the standard compilers". I won't repeat the items here, but it
seems like useful info.

http://cafesterol.x9c.fr/distrib/cafesterol.pdf


Patrick

Szegedi Attila

Nov 7, 2007, 4:31:25 AM
to jvm-la...@googlegroups.com

On 2007.11.07., at 10:15, Patrick Wright wrote:

>
>> http://cafesterol.x9c.fr/
>
> Great name. Made my day. Is that good or bad cafesterol?
>

Yep, also there's Barista: "Library for Java class file manipulation
- available at http://barista.x9c.fr" :-)

Attila.

Jon Harrop

Nov 7, 2007, 4:58:01 AM
to jvm-la...@googlegroups.com
On Wednesday 07 November 2007 08:47, Neil Bartlett wrote:
> Jon,
>
> It looks like your plea may have been answered. The following popped
> up on reddit this morning:
>
> "Cafesterol is an extension of the Objective Caml compiler suite that
> generates Java bytecode. Cafesterol provides an ocamljava compiler
> that is the Java counterpart of ocamlc/ocamlopt compilers distributed
> with the Objective Caml standard distribution. Cafesterol, in its 1.0
> alpha version builds with the 3.10.0 version of Objective Caml. The
> produced Java classes need the 1.0 alpha version of Cadmium to run and
> can be executed on any Java 1.5 virtual machine."
>
> http://cafesterol.x9c.fr/

Indeed, this looks extremely interesting.

Jon Harrop

Nov 7, 2007, 5:00:45 AM
to jvm-la...@googlegroups.com
On Wednesday 07 November 2007 05:21, hlovatt wrote:
> Therefore multiple dispatch provides the best of both worlds...

Does multiple dispatch provide static checking of exhaustiveness and
redundancy, and optimizing pattern match compilation via decision trees with
the usual pattern matching features (or-patterns, guards and so on)?

Also, what are the source code equivalents to the examples I posted?

David MacIver

Nov 9, 2007, 6:48:35 AM
to jvm-la...@googlegroups.com
On Nov 7, 2007 10:00 AM, Jon Harrop <j...@ffconsultancy.com> wrote:
>
> On Wednesday 07 November 2007 05:21, hlovatt wrote:
> > Therefore multiple dispatch provides the best of both worlds...

No one seems to have replied, so I thought I would. IANAExpert
disclaimers here. :-) Hopefully this will inspire someone who is to
clarify...

> Does multiple dispatch provide static checking of exhaustiveness and

It can do. Take a look at the ML-sub type system for an example of an
extension of Hindley-Milner with subtyping and multiple dispatch. The
JVM language Nice (almost) implements an extension of this system. I
don't remember the exact mechanism which it uses to do so, but I seem
to recall it working pretty well.

On the other hand, most multiple dispatch implementations are lispy so don't.

> redundancy, and optimizing pattern match compilation via decision trees with


Certainly. There are some really heavily optimised CLOS implementations.

> the usual pattern matching features (or-patterns, guards and so on)?

I don't know.

> Also, what are the source code equivalents to the examples I posted?

I'd need to think about it. I've not written any Nice in way too long.
It would probably be somewhere in between the two in terms of
verbosity. In particular there are no implicit sum types. You'd need
to define an abstract interface (similar to a Haskell type class) or
something.

hlovatt

Nov 9, 2007, 11:33:07 PM
to JVM Languages
> Does multiple dispatch provide static checking of exhaustiveness and
> redundancy, and optimizing pattern match compilation via decision trees with
> the usual pattern matching features (or-patterns, guards and so on)?

Yes. My PEC implementation doesn't provide guards, but in principle
multiple dispatch can.

> Also, what are the source code equivalents to the examples I posted?

The Bool example is already posted. The Exp example in PEC is:

public interface Exp extends MultipleDispatch { Exp d( Var x ); }

public abstract class AbstractExp implements Exp {
  public final Exp d( Var x ) { return null; } // Body replaced by PEC!
  public final static Exp d( Var v, Var x ) { return v == x ? new Lit( 1 ) : new Lit( 0 ); }
  public final static Exp d( Lit l, Var x ) { return new Lit( 0 ); }
  public final static Exp d( Add a, Var x ) { return new Add( a.left.d( x ), a.right.d( x ) ); }
  public final static Exp d( Mul m, Var x ) { return new Add( new Mul( m.left, m.right.d( x ) ), new Mul( m.right, m.left.d( x ) ) ); }
}

public class Var extends AbstractExp {
  public final String name;
  public Var( String name ) { this.name = name; }
  public String toString() { return name; }
}

public class Lit extends AbstractExp {
  public final double value;
  public Lit( final double value ) { this.value = value; }
  public String toString() { return Double.toString( value ); }
}

public class Add extends AbstractExp {
  public final Exp left;
  public final Exp right;
  public Add( Exp left, Exp right ) {
    this.left = left;
    this.right = right;
  }
  public String toString() { return left + " + " + right; }
}

public class Mul extends AbstractExp {
  public final Exp left;
  public final Exp right;
  public Mul( Exp left, Exp right ) {
    this.left = left;
    this.right = right;
  }
  public String toString() { return left + " * " + right; }
}

One aspect of the syntax that I like with multiple dispatch is that
you write overloaded functions rather than a switch statement, which
reads better to my eyes.

Jon Harrop

Nov 10, 2007, 8:33:06 AM
to jvm-la...@googlegroups.com

I cannot see why you would prefer the above 40 lines of code to the 5 line
equivalent in OCaml:

let rec d f x = match f with
  | `Var y when x=y -> `Int 1
  | `Int _ | `Var _ -> `Int 0
  | `Add(f, g) -> `Add(d f x, d g x)
  | `Mul(f, g) -> `Add(`Mul(f, d g x), `Mul(g, d f x))

You have added a toString() function. However, it is broken: you need to
bracket according to precedence. Doing this correctly is actually quite an
interesting exercise as well. Here is an OCaml implementation:

open Printf;;
let rec string_of () = function
| `Var x -> x
| `Int n -> sprintf "%d" n
| `Add(f, g) -> sprintf "%a + %a" string_of f string_of g
| `Mul(f, g) -> sprintf "%a %a" string_of_mul f string_of_mul g
and string_of_mul () = function
| `Add(f, g) as e -> sprintf "(%a)" string_of e
| e -> string_of () e;;

How does that translate?

> One aspect of the syntax that I like with multiple dispatch is that
> you write overloaded functions rather than a switch statement, which
> reads better to my eyes.

I suspect the bloat incurred by the approach that you are advocating is
even worse, though: more complicated patterns (particularly parallel patterns
and deep patterns) will probably require asymptotically more code.

Consider this simplifying symbolic multiplication function, for example:

let rec ( *: ) f g = match f, g with
  | `Int n, `Int m -> `Int(n * m)
  | f, `Int 0 | `Int 0, f -> `Int 0
  | f, `Int 1 | `Int 1, f -> f
  | f, `Mul(g, h) -> f *: g *: h
  | f, g -> `Mul(f, g)

How does that translate? Particularly the associativity match case?
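(As an aside, the simplifier above is runnable OCaml as-is; repeated here with a few sanity checks, including the associativity case:)

```ocaml
(* Jon's simplifying multiplication, repeated for a self-contained check. *)
let rec ( *: ) f g = match f, g with
  | `Int n, `Int m -> `Int (n * m)
  | f, `Int 0 | `Int 0, f -> `Int 0
  | f, `Int 1 | `Int 1, f -> f
  | f, `Mul (g, h) -> f *: g *: h
  | f, g -> `Mul (f, g)

let () =
  assert (`Int 3 *: `Int 4 = `Int 12);           (* constant folding *)
  assert (`Var "x" *: `Int 1 = `Var "x");        (* unit elimination  *)
  (* the associativity case: x * (1 * y) simplifies to x * y *)
  assert (`Var "x" *: `Mul (`Int 1, `Var "y") = `Mul (`Var "x", `Var "y"))
```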

David MacIver

Nov 10, 2007, 12:24:00 PM
to jvm-la...@googlegroups.com

I think you're conflating two issues. Yes, OCaml's variants are quite
nice (or at least they look it; I only know the subset of OCaml that
looks and behaves more or less like SML), but they're neither necessary
nor sufficient for pattern matching. A fairer comparison would be
between multiple dispatch and pattern matching on explicitly named
types.

Also, with all due respect to PEC, it's not really a good example of
multiple dispatch - it's Java with multiple dispatch tacked on. So,
here's the corresponding Nice code:

package expression;

// We don't need to declare this, but it will help for type safety.
interface Expr<!T> { }

// Expression types. Note that we do have to declare fields. It would be
// neat if we could extend tuples and get the advantages of destructuring
// bind on top of multiple dispatch, but Nice doesn't allow that.
class Add<!T> implements Expr<!T> {
  Expr<!T> fst;
  Expr<!T> snd;
}

class Mult<!T> implements Expr<!T> {
  Expr<!T> fst;
  Expr<!T> snd;
}

class Int<!T> implements Expr<!T> {
  int i;
}

class Var<!T> implements Expr<!T> {
  T t;
}

// Syntactic nicety
<!T> Expr<!T> `+` (Expr<!T> fst, Expr<!T> snd) = new Add(fst : fst, snd : snd);
<!T> Expr<!T> `*` (Expr<!T> fst, Expr<!T> snd) = new Mult(fst : fst, snd : snd);
<!T> Expr<!T> int(int i) = new Int(i : i);

// Differentiation of expressions.
// I swear there's a way to omit both the 'override' and the return types
// for the overridden methods, but I can't figure out how the syntax
// interacts with declaring type parameters. :-/
<!T> Expr<!T> diff(Expr<!T> v, T x);
override <!T> Expr<!T> diff(Var<!T> v, T x) = int(x == v.t ? 1 : 0);
override <!T> Expr<!T> diff(Int<!T> v, T x) = new Int(i : 0);
override <!T> Expr<!T> diff(Mult<!T> v, T x) = v.fst * diff(v.snd, x) + diff(v.fst, x) * v.snd;
override <!T> Expr<!T> diff(Add<!T> v, T x) = diff(v.fst, x) + diff(v.snd, x);

> let rec d f x = match f with
> | `Var y when x=y -> `Int 1
> | `Int _ | `Var _ -> `Int 0
> | `Add(f, g) -> `Add(d f x, d g x)
> | `Mul(f, g) -> `Add(`Mul(f, d g x), `Mul(g, d f x))
>
> You have added a toString() function. However, it is broken: you need to
> bracket according to precedence. Doing this correctly is actually quite an
> interesting exercise as well. Here is an OCaml implementation:
>
> open Printf;;
> let rec string_of () = function
> | `Var x -> x
> | `Int n -> sprintf "%d" n
> | `Add(f, g) -> sprintf "%a + %a" string_of f string_of g
> | `Mul(f, g) -> sprintf "%a %a" string_of_mul f string_of_mul g
> and string_of_mul () = function
> | `Add(f, g) as e -> sprintf "(%a)" string_of e
> | e -> string_of () e;;
>
> How does that translate?

// I'm pretty sure it should be possible to declare f inside the toString
// method for Mult (which is the only place it's used), but once again I
// can't figure out the syntax.
<!T> String f(Expr<!T> t) = toString(t);
override <!T> String f(Add<!T> t) = "(" + toString(t) + ")";

<!T> String toString (Int<!T> i) = toString(i.i);
<!T> String toString (Var<!T> v) = toString(v.t);
<!T> String toString (Mult<!T> m) = f(m.fst) + " " + f(m.snd);

Some things of note:

a) There are various language problems here, and various places where
it's clumsier than it should be. These are mostly unrelated to the
multiple dispatch. Some of them are Nice being clumsy, some of them
are me being really rusty at the language.
b) Nice doesn't do destructuring bind of constructors, but there's
absolutely no reason why it couldn't. It's just syntax. If it did then
the multiple dispatch declarations would look *much* closer to the
pattern matching.
c) Once we've declared the types, the only additional code over the
ocaml version is the need to include some type declarations it omits.

Philosophically speaking, pattern matching is extremely close to a
more limited form of multiple dispatch + destructuring bind of
constructors. But because it's more limited (and because a great deal
more work has been put into making it good!) the compiler can infer
more information in the statically typed case.

> > One aspect of the syntax that I like with multiple dispatch is that
> > you write overloaded functions rather than a switch statement, which
> > reads better to my eyes.
>
> I suspect the bloat incurred by the approach that you are advocating is
> even worse though: more complicated patterns (particularly parallel patterns
> and deep patterns) will probably require asymptotically more code.
>
> Consider this simplifying symbolic multiplication function, for example:
>
> let rec ( *: ) f g = match f, g with
> `Int n, `Int m -> `Int(n * m)
> | f, `Int 0 | `Int 0, f -> `Int 0
> | f, `Int 1 | `Int 1, f -> f
> | f, `Mul(g, h) -> f *: g *: h
> | f, g -> `Mul(f, g)
>
> How does that translate? Particularly the associativity match case?

This one indeed *doesn't* translate as well as I'd like. This is
basically because of the inability for Nice to chain the dispatch
rules nicely.

In particular there's no easy equivalent of the combining patterns,
and although we can dispatch on 0 and 1 specially this doesn't lift to
let us dispatch on Int 0 and Int 1 specially. I'm not sure why you
thought the associativity would be difficult though - that works out
fine.

// Multiplication
<!T> Expr<T> `*` (int i, Expr<T> x) = new Mult(int(i), x);
override <!T> Expr<T> `*` (int i, Int<T> x) = int(i * x.i);
override <!T> Expr<T> `*` (int 0, x) = int(0);
override <!T> Expr<T> `*` (int 1, x) = x;

<!T> Expr<T> `*` (Expr<T> x, int i) = new Mult(x, int(i));
override <!T> Expr<T> `*` (Int<T> x, int i) = int(i * x.i);
override <!T> Expr<T> `*` (x, int 0) = int(0);
override <!T> Expr<T> `*` (x, int 1) = x;

override <!T> Expr<T> `*` (Int<T> i, Expr<T> x) = i.i * x;
override <!T> Expr<T> `*` (Expr<T> x, Int<T> i) = i.i * x;
override <!T> Expr<T> `*` (Expr<T> x, Mult<T> y) = (x * y.fst) * y.snd;

As you can see, it involves a bunch of repetition and some annoying hacks.

Don't get me wrong - I agree that the OCaml code is better in places,
and there's not much in the Nice code which I can point to as better
(at least nothing relevant). Particularly the multiplication example.
There are various limitations on the Nice dispatch system which make
it hard to beat a good pattern matching system in this instance, and
the lack of destructuring bind is a real nuisance. But hopefully these
examples show that the basic facility to do what you want is there,
and none of these problems are inherent to the idea of multiple
dispatch. Additionally, I don't think the OCaml code is *much* better.

There are a few problems which will always be there - it's much harder
to infer things in the presence of full blown nominal subtyping for
example. But on the other hand there are a lot of things which
multiple dispatch makes noticeably easier as well. There are tradeoffs,
but I don't think either approach is obviously better than the other.

Jon Harrop

Nov 10, 2007, 3:42:05 PM
to jvm-la...@googlegroups.com
On Saturday 10 November 2007 17:24, David MacIver wrote:
> > I cannot see why you would prefer the above 40 lines of code to the 5
> > line equivalent in OCaml:
>
> I think you're conflating two issues. Yes, OCaml's variants are quite
> nice (or at least they look it. I only know the subset of OCaml that
> looks and behaves more or less like SML), but they're neither necessary
> nor sufficient for pattern matching. A fairer comparison would be
> between multiple dispatch and pattern matching on explicitly named
> types.

That is certainly a different comparison but I do not understand why you would
consider it "fairer".

Regardless, here is the OCaml with an explicitly-defined closed sum type:

type expr =
| Var of string
| Int of int
| Add of expr * expr
| Mul of expr * expr

let rec d f x = match f with
| Var y when x=y -> Int 1
| Int _ | Var _ -> Int 0
| Add(f, g) -> Add(d f x, d g x)

| Mul(f, g) -> Add(Mul(f, d g x), Mul(g, d f x))
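
For concreteness, the definition above compiles as written; here it is as a complete program with a quick sanity check on d/dx of x*x (the usage example is illustrative, not from the thread):

```ocaml
type expr =
  | Var of string
  | Int of int
  | Add of expr * expr
  | Mul of expr * expr

let rec d f x = match f with
  | Var y when x = y -> Int 1
  | Int _ | Var _ -> Int 0
  | Add (f, g) -> Add (d f x, d g x)
  | Mul (f, g) -> Add (Mul (f, d g x), Mul (g, d f x))

(* d/dx (x * x) = x*1 + x*1, before any simplification. *)
let () =
  assert (d (Mul (Var "x", Var "x")) "x"
          = Add (Mul (Var "x", Int 1), Mul (Var "x", Int 1)))
```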

> Also, with all due respect to PEC, it's not really a good example of
> multiple dispatch - it's Java with multiple dispatch tacked on. So,
> here's the corresponding Nice code:
>
> package expression;
>

> interface Expr<!T> { }


>
> class Add<!T> implements Expr<!T>{
> Expr<!T> fst;
> Expr<!T> snd;
> }
>
> class Mult<!T> implements Expr<!T>{
> Expr<!T> fst;
> Expr<!T> snd;
> }
>
> class Int<!T> implements Expr<!T>{
> int i;
> }
>
> class Var<!T> implements Expr<!T>{
> T t;
> }
>

> <!T> Expr<!T> `+` (Expr<!T> fst, Expr<!T> snd) = new Add(fst : fst, snd : snd);
> <!T> Expr<!T> `*` (Expr<!T> fst, Expr<!T> snd) = new Mult(fst : fst, snd : snd);
> <!T> Expr<!T> int(int i) = new Int(i : i);
>

> <!T> Expr<!T> diff(Expr<!T> v, T x);
> override <!T> Expr<!T> diff(Var<!T> v, T x) = int(x == v.t ? 1 : 0);
> override <!T> Expr<!T> diff(Int<!T> v, T x) = new Int(i : 0);
> override <!T> Expr<!T> diff(Mult<!T> v, T x) = v.fst * diff(v.snd, x)
> + diff(v.fst, x) * v.snd;
> override <!T> Expr<!T> diff(Add<!T> v, T x) = diff(v.fst, x) + diff(v.snd, x);

That still looks like a very long-winded way of saying the same thing to me.
You've got a lot of explicit type declarations and lots of explicit
parameterization along with a bunch of superfluous keywords (package,
interface, class, implements, override etc.).

> > open Printf;;
> > let rec string_of () = function
> >
> > | `Var x -> x
> > | `Int n -> sprintf "%d" n
> > | `Add(f, g) -> sprintf "%a + %a" string_of f string_of g
> > | `Mul(f, g) -> sprintf "%a %a" string_of_mul f string_of_mul g
> >
> > and string_of_mul () = function
> >
> > | `Add(f, g) as e -> sprintf "(%a)" string_of e
> > | e -> string_of () e;;
> >
> > How does that translate?
>

> <!T> String f(Expr<!T> t) = toString(t);
> override <!T> String f(Add<!T> t) = "(" toString(t) ")";
>
> <!T> String toString (Int<!T> i) = toString(i.i);
> <!T> String toString (Var<!T> v) = toString(v.t);
> <!T> String toString (Mult<!T> m) = f(m.fst) + f(m.snd);
>
> Some things of note:
>
> a) There are various language problems here, and various places where
> it's clumsier than it should be. These are mostly unrelated to the
> multiple dispatch. Some of them are Nice being clumsy, some of them
> are me being really rusty at the language.
> b) Nice doesn't do destructuring bind of constructors, but there's
> absolutely no reason why it couldn't. It's just syntax. If it did then
> the multiple dispatch declarations would look *much* closer to the
> pattern matching.

In some cases (tuples), yes. I'm not sure that generalizes though. For
example, the following pattern matches any linear expression in a
variable "x" with constants (either numbers or variables) bound to "m"
and "c":

let linear_coefficients = function
| `Add((`Mul((`Int _ | `Var _ as m), `Var "x") |
`Mul(`Var "x", (`Int _ | `Var _ as m))),
(`Int _ | `Var _ as c))
| `Add((`Int _ | `Var _ as c),
(`Mul((`Int _ | `Var _ as m), `Var "x") |
`Mul(`Var "x", (`Int _ | `Var _ as m)))) when m <> `Var "x" ->
Some(m, c)
| _ -> None;;

I suspect the multiple dispatch equivalent is enormously verbose.
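
For reference, the pattern above compiles as written; here it is as a complete program with a usage check on 3x + 2 (the check is illustrative, not from the thread):

```ocaml
(* Matches m*x + c, in either argument order, with m and c constants. *)
let linear_coefficients = function
  | `Add((`Mul((`Int _ | `Var _ as m), `Var "x") |
          `Mul(`Var "x", (`Int _ | `Var _ as m))),
         (`Int _ | `Var _ as c))
  | `Add((`Int _ | `Var _ as c),
         (`Mul((`Int _ | `Var _ as m), `Var "x") |
          `Mul(`Var "x", (`Int _ | `Var _ as m)))) when m <> `Var "x" ->
      Some (m, c)
  | _ -> None

let () =
  (* 3x + 2 has slope 3 and intercept 2. *)
  assert (linear_coefficients (`Add (`Mul (`Int 3, `Var "x"), `Int 2))
          = Some (`Int 3, `Int 2));
  assert (linear_coefficients (`Var "x") = None)
```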

> c) Once we've declared the types, the only additional code over the
> ocaml version is the need to include some type declarations it omits.

No, you've still had to compile the pattern match by hand. Look at the
catchall clause in my OCaml string_of_mul function, for example:

| e -> string_of () e

That translated to:

<!T> String toString (Int<!T> i) = toString(i.i);
<!T> String toString (Var<!T> v) = toString(v.t);
<!T> String toString (Mult<!T> m) = f(m.fst) + f(m.snd);

So you had to handle every other constructor individually by hand. That is
asymptotically more code.

Finally, I suspect you cannot "include some type declarations it omits"
because I'm using polymorphic variants. The OO equivalent would be something
like class hierarchies with multiple inheritance from some other
(polymorphic) classes. You might be able to implement that in C++ using
templates but I doubt many OO languages could represent it without
sacrificing static checking.

> Philosophically speaking, pattern matching is extremely close to a
> more limited form of multiple dispatch + destructuring bind of
> constructors. But because it's more limited (and because a great deal
> more work has been put into making it good!) the compiler can infer
> more information in the statically typed case.

That doesn't make sense to me. You're saying that multiple dispatch could be
more competitive with pattern matching if an implementation provided a
variety of extra features but none do. Can you give concise pseudo-code
assuming that there was a language that implemented the extra features that
you need?

From what I've seen, the multiple dispatch approach seems to be significantly
worse (i.e. several times as much code) at handling a small subset of
patterns (shallow and enumerated constructors) but it doesn't look as if it
will scale to deeper or more sophisticated patterns (catchalls, guards,
or-patterns, etc.).

Also, how do you distinguish between open and closed sum types and how does it
handle overlapping sum types?
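
(In OCaml terms, ordinary variant declarations are closed while polymorphic variants are open; a minimal illustration of the distinction being asked about:)

```ocaml
(* Closed sum: the declaration fixes the constructor set, and the
   compiler checks matches over it for exhaustiveness. *)
type closed = A | B
let describe = function A -> "a" | B -> "b"

(* Open sum via polymorphic variants: no declaration needed. This
   function has type [< `A | `B ] -> int, accepting any value whose
   tags fall within {`A, `B}. *)
let to_int = function `A -> 0 | `B -> 1

let () =
  assert (describe A = "a");
  assert (to_int `B = 1)
```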

I guess I'm missing a killer example that uses multiple dispatch to achieve
something that cannot be achieved so easily with pattern matching. Do you
have such an example?

> > let rec ( *: ) f g = match f, g with
> > `Int n, `Int m -> `Int(n * m)
> >
> > | f, `Int 0 | `Int 0, f -> `Int 0
> > | f, `Int 1 | `Int 1, f -> f
> > | f, `Mul(g, h) -> f *: g *: h
> > | f, g -> `Mul(f, g)
>

> This one indeed *doesn't* translate as well as I'd like. This is
> basically because of the inability for Nice to chain the dispatch
> rules nicely.
>
> In particular there's no easy equivalent of the combining patterns,
> and although we can dispatch on 0 and 1 specially this doesn't lift to
> let us dispatch on Int 0 and Int 1 specially.

Right, that's because the pattern is one deeper than before. Patterns can be
arbitrarily deep, of course, and I think you will have to invent a new
function for every depth of every pattern in general.

> I'm not sure why you thought the associativity would be difficult though -
> that works out fine.

Again, this is because you've manually handled a 1-deep pattern by manually
calling fst and snd to deconstruct it. If you want to deconstruct deeper then
I think you'll have to resort to other techniques.

That means you can handle this pattern elegantly:

Some a, Some b, Some c, Some d -> Some(a, b, c, d)

but not this one:

Some(a, Some(b, (Some c, Some d))) -> Some(a, b, c, d)
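
That nested pattern is perfectly ordinary OCaml; spelled out with option values as a complete program:

```ocaml
(* A deep pattern: every level of nesting is matched in one case. *)
let collect = function
  | Some (a, Some (b, (Some c, Some d))) -> Some (a, b, c, d)
  | _ -> None

let () =
  assert (collect (Some (1, Some (2, (Some 3, Some 4))))
          = Some (1, 2, 3, 4));
  assert (collect None = None)
```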

> // Multiplication
> <!T> Expr<T> `*` (int i, Expr<T> x) = new Mult(int(i), x);
> override <!T> Expr<T> `*` (int i, Int<T> x) = int(i * x.i);
> override <!T> Expr<T> `*` (0, x) = int(0);
> override <!T> Expr<T> `*` (int 1, x) = x;
>
> <!T> Expr<T> `*` (Expr<T> x, int i) = new Mult(x, int i);
> override <!T> Expr<T> `*` (Int<T> x, int i) = int(i * x.i);
> override <!T> Expr<T> `*` (x, int 0) = int(0);
> override <!T> Expr<T> `*` (x, int 1) = x;
>
> override <!T> Expr<T> `*` (Int<T> i, Expr<T> x) = i.i * x;
> override <!T> Expr<T> `*` (Expr<T> x, Int<T> i) = i.i * x;
> override <!T> Expr<T> `*` (Expr<T> x, Mult<T> y) = (x * y.fst) * y.snd;

Which bit gets called when you multiply a Var by a Var?

> As you can see, it involves a bunch of repetition and some annoying hacks.

Yes. Also, the use of overloading conflicts with type inference so you are
forced to explicitly annotate all of the types yourself. Even if you could
add features to the language to help, I do not see a way around that.

> Don't get me wrong - I agree that the OCaml code is better in places,
> and there's not much in the Nice code which I can point to as better
> (at least nothing relevant). Particularly the multiplication example.
> There are various limitations on the Nice dispatch system which make
> it hard to beat a good pattern matching system in this instance, and
> the lack of destructuring bind is a real nuisance. But hopefully these
> examples show that the basic facility to do what you want is there,
> and none of these problems are inherent to the idea of multiple
> dispatch. Additionally, I don't think the OCaml code is *much* better.

OCaml certainly requires several times less code. IMHO, the aspect of OCaml
that we are discussing is one of its strongest points. I have never seen
another language come close to it here (including SML).

> There are a few problems which will always be there - it's much harder
> to infer things in the presence of full blown nominal subtyping for
> example. But on the other hand there are a lot of things which
> multiple dispatch makes noticably easier as well. There are tradeoffs,
> but I don't think either approach is obviously better than the other.

If you are not aware of modern pattern matchers (i.e. since SML) then I think
it is premature to state "there are a lot of things which multiple dispatch
makes noticably easier as well". Can you give any examples and I'll try to
translate them into OCaml/F#?

David MacIver

Nov 11, 2007, 6:39:42 AM
to jvm-la...@googlegroups.com
On Nov 10, 2007 8:42 PM, Jon Harrop <j...@ffconsultancy.com> wrote:
>
> On Saturday 10 November 2007 17:24, David MacIver wrote:
> > > I cannot see why you would prefer the above 40 lines of code to the 5
> > > line equivalent in OCaml:
> >
> > I think you're conflating two issues. Yes, OCaml's variants are quite
> > nice (or at least they look it. I only know the subset of OCaml that
> > looks and behaves more or less like SML), but they're neither necessary
> > nor sufficient for pattern matching. A fairer comparison would be
> > between multiple dispatch and pattern matching on explicitly named
> > types.
>
> That is certainly a different comparison but I do not understand why you would
> consider it "fairer".

Because a) Using explicitly named types has certain advantages, both
in terms of the programmer's usage and encoding invariants in the
program and b) Your claim that this is an advantage of the "ML family
of languages" then actually holds. Currently you're comparing a very
specific feature of OCaml (which its immediate descendant F# has
inherited).

> > package expression;
> >
> > interface Expr<!T> { }
> >
> > class Add<!T> implements Expr<!T>{
> > Expr<!T> fst;
> > Expr<!T> snd;
> > }
> >
> > class Mult<!T> implements Expr<!T>{
> > Expr<!T> fst;
> > Expr<!T> snd;
> > }
> >
> > class Int<!T> implements Expr<!T>{
> > int i;
> > }
> >
> > class Var<!T> implements Expr<!T>{
> > T t;
> > }
> >
> > <!T> Expr<!T> `+` (Expr<!T> fst, Expr<!T> snd) = new Add(fst : fst, snd : snd);
> > <!T> Expr<!T> `*` (Expr<!T> fst, Expr<!T> snd) = new Mult(fst : fst, snd : snd);
> > <!T> Expr<!T> int(int i) = new Int(i : i);
> >
> > <!T> Expr<!T> diff(Expr<!T> v, T x);
> > override <!T> Expr<!T> diff(Var<!T> v, T x) = int(x == v.t ? 1 : 0);
> > override <!T> Expr<!T> diff(Int<!T> v, T x) = new Int(i : 0);
> > override <!T> Expr<!T> diff(Mult<!T> v, T x) = v.fst * diff(v.snd, x)
> > + diff(v.fst, x) * v.snd;
> > override <!T> Expr<!T> diff(Add<!T> v, T x) = diff(v.fst, x) + diff(v.snd, x);
>
> That still looks like a very long-winded way of saying the same thing to me.

Certainly. If you look at every part other than the bits which are
actually under discussion.

> You've got a lot of explicit type declarations and lots of explicit
> parameterization

Once again, you're focusing on trivia. I'm not trying to argue with
the claim "OCaml is the bestest language ever!". I'm sure as hell not
here to argue myself that Nice *is*. It's clearly not. I am pointing
out that multiple dispatch is not the horribly verbose beast you claim
it is and that it has certain strengths you're overlooking.

> along with a bunch of superfluous keywords (package,

Nice requires explicit packages for everything. This is a Java
interoperability issue.

> interface,

Is the equivalent of your 'type' keyword in the above.

>class, implements,

Admittedly redundant, but not that big a deal. Something terser than
'implements' would be good though.

>override etc.).

The syntax for Nice isn't very well thought out sometimes (it's
designed to be too Java like to be truly elegant), and the particular
issue here is that normally you can omit the return type of the
overriding function (and the 'override' declaration when you do), but
I couldn't figure out how to make that play well with the requirement
of explicit type parameter declarations. A better approach would be to
use some unambiguous notation for unbound type parameters. ('a, 'b,
'c, etc. would work for example). You can even omit it in this case,
but the compiler warns you if you do.

> > Some things of note:
> >
> > a) There are various language problems here, and various places where
> > it's clumsier than it should be. These are mostly unrelated to the
> > multiple dispatch. Some of them are Nice being clumsy, some of them
> > are me being really rusty at the language.
> > b) Nice doesn't do destructuring bind of constructors, but there's
> > absolutely no reason why it couldn't. It's just syntax. If it did then
> > the multiple dispatch declarations would look *much* closer to the
> > pattern matching.
> In some cases (tuples), yes. I'm not sure that generalizes though. For
> example, the following pattern matches any linear expression in a
> variable "x" with constants (either numbers or variables) bound to "m"
> and "c":
>
> let linear_coefficients = function
> | `Add((`Mul((`Int _ | `Var _ as m), `Var "x") |
> `Mul(`Var "x", (`Int _ | `Var _ as m))),
> (`Int _ | `Var _ as c))
> | `Add((`Int _ | `Var _ as c),
> (`Mul((`Int _ | `Var _ as m), `Var "x") |
> `Mul(`Var "x", (`Int _ | `Var _ as m)))) when m <> `Var "x" ->
> Some(m, c)
> | _ -> None;;
>
> I suspect the multiple dispatch equivalent is enormously verbose.

Yep, it probably is. I need to think about it. The or patterns don't
translate well.

> > c) Once we've declared the types, the only additional code over the
> > ocaml version is the need to include some type declarations it omits.
>
> No, you've still had to compile the pattern match by hand. Look at the
> catchall clause in my OCaml string_of_mul function, for example:
>
> | e -> string_of () e
>
> That translated to:
>
> <!T> String toString (Int<!T> i) = toString(i.i);
> <!T> String toString (Var<!T> v) = toString(v.t);
> <!T> String toString (Mult<!T> m) = f(m.fst) + f(m.snd);
>
> So you had to handle every other constructor individually by hand. That is
> asymptotically more code.

Either you've misunderstood the translation, or I've misunderstood
what your code is doing:

open Printf;;
let rec string_of () = function
| `Var x -> x
| `Int n -> sprintf "%d" n
| `Add(f, g) -> sprintf "%a + %a" string_of f string_of g
| `Mul(f, g) -> sprintf "%a %a" string_of_mul f string_of_mul g
and string_of_mul () = function
| `Add(f, g) as e -> sprintf "(%a)" string_of e
| e -> string_of () e;;

I see cases for all the constructors there. The three lines you're
objecting to are the translation of:

| `Int n -> sprintf "%d" n
| `Add(f, g) -> sprintf "%a + %a" string_of f string_of g
| `Mul(f, g) -> sprintf "%a %a" string_of_mul f string_of_mul g

Further, playing the line count game, I count 7 lines for declaring
string_of in your OCaml code (not even counting the import) and 5 for
my toString declaration.

> Finally, I suspect you cannot "include some type declarations it omits"
> because I'm using polymorphic variants. The OO equivalent would be something
> like class hierarchies with multiple inheritance from some other
> (polymorphic) classes. You might be able to implement that in C++ using
> templates but I doubt many OO languages could represent it without
> sacrificing static checking.

I'm sure that's the case. Most good OO languages are dynamically
typed. Scala is an exception to that, but I can't think of a good way
to do it there. You can encode generic sum types in Scala's pattern
matching without *too* much overhead, but it's pretty grim.

> > Philosophically speaking, pattern matching is extremely close to a
> > more limited form of multiple dispatch + destructuring bind of
> > constructors. But because it's more limited (and because a great deal
> > more work has been put into making it good!) the compiler can infer
> > more information in the statically typed case.
>
> That doesn't make sense to me. You're saying that multiple dispatch could be
> more competitive with pattern matching if an implementation provided a
> variety of extra features but none do.

I don't know where you extracted that "but none do" from. Nice
doesn't. I'm so far from an expert on multiple dispatch that it's not
funny, so I can't comment on what the state of the art in other
languages is (I did say I was the wrong person to be arguing this :-)
). I imagine CLOS provides some subset of the features that Nice
doesn't, but it's not statically checked.

Also, I retract the "philosophically speaking etc." statement. See later.

> Can you give concise pseudo-code
> assuming that there was a language that implemented the extra features that
> you need?

Here's a translation of the Nice code.

Changes I've made:

Syntactic: I've made type parameters implicit, declared with a leading
'. (this is to prevent namespacing clashes. I could just use Haskell's
lower case parameters instead I suppose, as I'm not really operating
under the same constraints as Nice). This has allowed me to drop the
override type declarations. I've also replaced implements with <:
(scala-style. It's also the notation used in the original paper behind
Nice if I recall correctly, but was dropped in the "Hey, wouldn't it
be really great if our language looked more like Java??"
foolhardiness)

I've added destructuring bind and replaced the Java-style explicit
fields with ML/Haskell style constructors for each type. This also
allows one to pass dispatch (including value dispatch) through the
binding nicely.

package expression;

interface Expr<'a> { }

// Changing from explicit fields to ML/Haskell style constructors, as
// we now have the ability to extract these nicely.
class Add<'a> Expr<'a> Expr<'a> <: Expr<'a>
class Mult<'a> Expr<'a> Expr<'a> <: Expr<'a>
class Int<'a> int <: Expr<'a>
class Var<'a> 'a <: Expr<'a>

// Differentiation of expressions.
Expr<'a> diff(Expr<'a> v, 'a x); // Type parameters are now implicit,
// indicated by starting with '.
diff(Var<'a> x, 'a y) = Int(x == y ? 1 : 0); // Now that we have that,
// we can omit the override declaration (as we should have been able to
// anyway).
diff(Int<'a> v, 'a x) = Int 0;
diff(Mult<'a> u v, 'a x) = u * diff(v, x) + diff(u, x) * v; // Note
// use of destructuring bind.
diff(Add<'a> u v, 'a x) = diff(u, x) + diff(v, x);

// Addition
Expr<'a> `+` (Expr<'a> fst, Expr<'a> snd) = Add(fst : fst, snd : snd);

// Multiplication
Expr<'a> `*` (Expr<'a> x, Expr<'a> y) = Mult(x, y);
`*` (Int<'a> 0, Expr<'a> x) = Int 0;
`*` (Int<'a> 1, Expr<'a> x) = x;
`*` (Expr<'a> x, Int<'a> 0) = Int 0; // Still have to repeat this. :(
`*` (Expr<'a> x, Int<'a> 1) = x;
`*` (Int<'a> i, Int<'a> j) = Int (i * j);
`*` (Expr<'a> x, Mult<'a> u v) = (x * u) * v;

// Stringification
String f(Expr<'a> t) = toString(t);
f(Add<'a> t) = "(" + toString(t) + ")";

toString (Int<'a> i) = toString(i);
toString (Var<'a> v) = toString(v);
toString (Mult<'a> u v) = f(u) + f(v);
toString (Add<'a> u v) = toString(u) + "+" + toString(v)

> From what I've seen, the multiple dispatch approach seems to be significantly
> worse (i.e. several times as much code) at handling a small subset of
> patterns (shallow and enumerated constructors) but it doesn't look as if it

I've no idea where this "several times as much code" is coming from.
The only place where the multiple dispatch was line for line longer
was in the inability to chain dispatches nicely (which the above
pseudo-code fixes) and the use of or patterns (which it admittedly
doesn't. I can't think of a way to fix that at the moment).

> will scale to deep or more sophisticated (catchall, guarded or or-patterns
> etc.).

Quite possibly not. Perhaps some mixture of the two is in order.

> Also, how do you distinguish between open and closed sum types and how does it

All sum types are open. Types may be final however.

> handle overlapping sum types?

You need to provide a specialisation of the method for the overlap.
(Nice has the annoying misfeature that it only complains at you when
you actually try to invoke the function on the overlap, but that's not
inherent to the concept).

> I guess I'm missing a killer example that uses multiple dispatch to achieve
> something that cannot be achieved so easily with pattern matching. Do you
> have such an example?

Well, so far all your examples have been along the lines of "Hey,
pattern matching is much better at being pattern matching than
multiple dispatch is!". This isn't very surprising. :-) Let's step
away from pattern matching style examples.

I'm just going to use pseudo-code now. I really can't be bothered with
Nice's syntax :-) (I briefly looked into writing an alternative
syntax for it at one point, but my compiler-fu was even weaker then
than it is now and I couldn't make head nor tail of the compiler).

Suppose we have some sequence abstract datatype. We can encode this in
the type system as follows:

interface Sequence<'a>;

(Sequence<'a> 'b) => {
'b nil();
'b `:` ('a x, 'b xs);
Maybe<('a, 'b)> uncons('b seq);
fold : ('a -> 'c -> 'c) -> 'c -> 'b -> 'c
}

(This is along the lines of Nice's abstract interfaces or Haskell's
type classes)

Now, suppose at some later date in a module far far away we want to define:

(Sequence<'a> 'b) => ('b, 'b) split('b seq, 'a -> boolean f);

This splits at the first sequence element for which f evaluates to
true. We do the obvious thing with cons, uncons, etc.

But there are datatypes for which this is a horrendously bad
implementation. And we *can't* write a universally good
implementation. What works well for a linked list will work terribly
for an array, and vice versa. But what works well for a linked list
will work pretty well for a finger tree or anything else with cheap
cons and uncons, so it's nice to have that default implementation in
there.

No problem. We write a specialisation.

split(Array<'a> arr, 'a -> boolean f) = (do the obvious thing);

And carry on our merry way.

You could use modules and signatures to encode the abstract data type
in OCaml, but I don't see how you'd permit arbitrary specialisation
like that. In Haskell we often end up with adding a lot of extra
nonsense into the type class declarations with default implementations
in order to allow exactly these sorts of optimisations. (GHC has a
'specialise' pragma for other cases I think).
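
As a point of comparison, the default cons/uncons-driven split is only a few lines in OCaml over plain lists; this is just a sketch of the generic algorithm, not a translation of the abstract-interface machinery:

```ocaml
(* Split at the first element satisfying f, walking the prefix with
   cons/uncons only; cheap for lists, terrible for arrays. *)
let split f seq =
  let rec go acc = function
    | [] -> (List.rev acc, [])
    | x :: xs when f x -> (List.rev acc, x :: xs)
    | x :: xs -> go (x :: acc) xs
  in
  go [] seq

let () =
  assert (split (fun x -> x > 2) [1; 2; 3; 4] = ([1; 2], [3; 4]));
  assert (split (fun _ -> false) [1; 2] = ([1; 2], []))
```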

> > This one indeed *doesn't* translate as well as I'd like. This is
> > basically because of the inability for Nice to chain the dispatch
> > rules nicely.
> >
> > In particular there's no easy equivalent of the combining patterns,
> > and although we can dispatch on 0 and 1 specially this doesn't lift to
> > let us dispatch on Int 0 and Int 1 specially.
>
> Right, that's because the pattern is one deeper than before. Patterns can be
> arbitrarily deep, of course, and I think you will have to invent a new
> function for every depth of every pattern in general.

What do you mean by 'depth' here? Are we talking about or patterns or
nesting constructors? If the former, I think you're right. If the
latter, see pseudocode.

> > I'm not sure why you thought the associativity would be difficult though -
> > that works out fine.
>
> Again, this is because you've manually handled a 1-deep pattern by manually
> calling fst and snd to deconstruct it. If you want to deconstruct deeper then
> I think you'll have to resort to other techniques.

Right. But that's just a destructuring. It's not part of the actual match.

> > // Multiplication
> > <!T> Expr<T> `*` (int i, Expr<T> x) = new Mult(int(i), x);
> > override <!T> Expr<T> `*` (int i, Int<T> x) = int(i * x.i);
> > override <!T> Expr<T> `*` (0, x) = int(0);
> > override <!T> Expr<T> `*` (int 1, x) = x;
> >
> > <!T> Expr<T> `*` (Expr<T> x, int i) = new Mult(x, int i);
> > override <!T> Expr<T> `*` (Int<T> x, int i) = int(i * x.i);
> > override <!T> Expr<T> `*` (x, int 0) = int(0);
> > override <!T> Expr<T> `*` (x, int 1) = x;
> >
> > override <!T> Expr<T> `*` (Int<T> i, Expr<T> x) = i.i * x;
> > override <!T> Expr<T> `*` (Expr<T> x, Int<T> i) = i.i * x;
> > override <!T> Expr<T> `*` (Expr<T> x, Mult<T> y) = (x * y.fst) * y.snd;
>
> Which bit gets called when you multiply a Var by a Var?

Sorry, I already had a definition of * further up for convenience. See:

<!T> Expr<!T> `*` (Expr<!T> fst, Expr<!T> snd) = new Mult(fst : fst, snd : snd);

This is what gets called in the default case.

> > As you can see, it involves a bunch of repetition and some annoying hacks.
>
> Yes. Also, the use of overloading conflicts with type inference so you are
> forced to explicitly annotate all of the types yourself. Even if you could
> add features to the language to help, I do not see a way around that.

Some of that's a syntax issue, as mentioned above.

You *do* have to declare the parameter types, but for the most part
only in the same way that you have to declare the constructor names
when pattern matching. The types of the parameters are syntactically
meaningful, which is why they can't be inferred. Return types should
generally be able to be inferred a bit better than Nice manages I
think, unless you want to be able to dispatch on them (which is
possible in only some cases in Nice)

In general the overhead in Nice over basic pattern matching is
approximately equivalent to providing a type signature for each top
level function, which is often considered good practice anyway. There
is indeed not a good equivalent of or patterns, guards, etc.

> > Don't get me wrong - I agree that the OCaml code is better in places,
> > and there's not much in the Nice code which I can point to as better
> > (at least nothing relevant). Particularly the multiplication example.
> > There are various limitations on the Nice dispatch system which make
> > it hard to beat a good pattern matching system in this instance, and
> > the lack of destructuring bind is a real nuisance. But hopefully these
> > examples show that the basic facility to do what you want is there,
> > and none of these problems are inherent to the idea of multiple
> > dispatch. Additionally, I don't think the OCaml code is *much* better.
>
> OCaml certainly requires several times less code. IMHO, the aspect of OCaml
> that we are discussing is one of its strongest points. I have never seen
> another language come close to it here (including SML).

It does seem quite good. I might have to give OCaml another look.

> > There are a few problems which will always be there - it's much harder
> > to infer things in the presence of full blown nominal subtyping for
> > example. But on the other hand there are a lot of things which
> > multiple dispatch makes noticably easier as well. There are tradeoffs,
> > but I don't think either approach is obviously better than the other.
>
> If you are not aware of modern pattern matchers (i.e. since SML) then I think

Well, my SML is a bit rusty too. But the only bit of pattern matching
that couldn't be done in Haskell (which I know at least moderately
well) was the or pattern (which is very nice. I've definitely wanted
something like that before). The variants couldn't, but that's a type
system issue, not pattern matching per se.
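
For readers who haven't met them, an or-pattern shares one right-hand side (and one set of bindings) across its alternatives:

```ocaml
(* Each alternative of the or-pattern must bind the same variables. *)
let payload = function
  | `Left x | `Right x -> x

let () =
  assert (payload (`Left 1) = 1);
  assert (payload (`Right 2) = 2)
```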

You on the other hand seem perfectly willing to argue that pattern
matching is superior to multiple dispatch without knowing much about
it at all.

> it is premature to state "there are a lot of things which multiple dispatch
> makes noticeably easier as well". Can you give any examples and I'll try to
> translate them into OCaml/F#?

I gave one above. Let's have a variation on your expression example.
In the interests of having something which actually compiles as
opposed to writing code in the "All of David's favourite features
rolled into one bundle" metalanguage, I've reverted to using Nice
here.

There are three types of expression: Atoms, BinaryOps and UnaryOps.
Additionally, UnaryOps can be functions. We want to print them (I'm
going to ignore precedence, as this example is intended to be open and
handling precedence generically is Annoying. If you want to include it
I'm happy to do so also).

Each type has a symbol. We want to convert these to strings as follows:

package expression;

// We don't need to declare this, but it will help for type safety.
interface Expr<!T> {
  String symbol();
}

abstract class BinaryOp<!T> implements Expr<T> {
Expr<T> fst;
Expr<T> snd;
}

abstract class UnaryOp<!T> implements Expr<T>{
Expr<T> arg;
}

abstract class Atom<!T> implements Expr<T>{}

class Fact<!T> extends UnaryOp<T> {
symbol() = "!";
}

class Fun<!T> extends UnaryOp<T> {
String name;
symbol() = name;
}

class Add<!T> extends BinaryOp<T> {
symbol() = "+";
}
class Mult<!T> extends BinaryOp<T>{
symbol() = "*";
}

class Int<!T> extends Atom<T>{
int i;
symbol() = toString(i);
}

class Var<!T> extends Atom<T>{
T t;
symbol() = toString(t);
}

<!T> String toString(Atom<T> atom) = symbol(atom);
<!T> String toString(UnaryOp<T> unary) = escape(unary.arg) + symbol(unary);
override <!T> String toString(Fun<T> fun) = symbol(fun) + escape(fun.arg);
<!T> String toString(BinaryOp<T> bin) = escape(bin.fst) + symbol(bin)
+ escape(bin.snd);

<!T> String escape (Expr<T> expr) = toString(expr);
override <!T> String escape (BinaryOp<T> expr) = "(" + toString(expr) + ")";

Relevant features: Although unary ops are normally postfix, functions
are prefix. Additionally, this needs to be open so we can add new
operators in other modules. toString should handle additions without
modification unless they require a specific change to the rules.

It seems to me that while multiple dispatch can only handle shallow
patterns, pattern matching can only handle shallow hierarchies. Maybe
OCaml has some killer feature which makes this not the case, but I
definitely don't see it in what you've shown so far.

It also occurs to me that there's absolutely no reason one couldn't
combine the two paradigms. You could pretty much drop in multiple
dispatch as a replacement for the constructor matching in a normal
pattern matching system. I wonder how well that would work. Maybe I
should give it a try. I've been wibbling about putting some of my
half-baked language ideas into practice for the best part of a year
now. :-)

Jon Harrop

Nov 11, 2007, 8:54:37 AM
to jvm-la...@googlegroups.com
On Sunday 11 November 2007 11:39, David MacIver wrote:
> On Nov 10, 2007 8:42 PM, Jon Harrop <j...@ffconsultancy.com> wrote:
> > That is certainly a different comparison but I do not understand why you
> > would consider it "fairer".
>
> Because a) Using explicitly named types has certain advantages, both
> in terms of the programmer's usage and encoding invariants in the
> program

Explicit types also have disadvantages. I think it is only fair to consider
both approaches.

> and b) Your claim that this is an advantage of the "ML family
> of languages" then actually holds. Currently you're comparing a very
> specific feature of OCaml (which its immediate descendant F# has
> inherited).

The ML family of languages basically is OCaml and F# now:

http://ocamlnews.blogspot.com/2007/11/most-popular-functional-languages-on.html

> ...


> Nice requires explicit packages for everything. This is a Java
> interoperability issue.

Ok. Do any languages offer multiple dispatch without this baggage?

> The syntax for Nice isn't very well thought out sometimes (it's
> designed to be too Java like to be truly elegant), and the particular
> issue here is that normally you can omit the return type of the
> overriding function (and the 'override' declaration when you do), but
> I couldn't figure out how to make that play well with the requirement
> of explicit type parameter declarations. A better approach would be to
> use some unambiguous notation for unbound type parameters. ('a, 'b,
> 'c, etc. would work for example). You can even omit it in this case,
> but the compiler warns you if you do.

What are the implications of the warning? Is the code no longer statically
typed?

> > let linear_coefficients = function
> >   | `Add((`Mul((`Int _ | `Var _ as m), `Var "x") |
> >           `Mul(`Var "x", (`Int _ | `Var _ as m))),
> >          (`Int _ | `Var _ as c))
> >   | `Add((`Int _ | `Var _ as c),
> >          (`Mul((`Int _ | `Var _ as m), `Var "x") |
> >           `Mul(`Var "x", (`Int _ | `Var _ as m)))) when m <> `Var "x" ->
> >       Some(m, c)
> >   | _ -> None;;
> >
> > I suspect the multiple dispatch equivalent is enormously verbose.
>
> Yep, it probably is. I need to think about it. The or patterns don't
> translate well.

Yes.

> > | e -> string_of () e
> >
> > That translated to:
> >
> > <!T> String toString (Int<!T> i) = toString(i.i);
> > <!T> String toString (Var<!T> v) = toString(v.t);
> > <!T> String toString (Mult<!T> m) = f(m.fst) + f(m.snd);
> >
> > So you had to handle every other constructor individually by hand. That
> > is asymptotically more code.
>
> Either you've misunderstood the translation, or I've misunderstood
> what your code is doing:

I think I misunderstood the translation. Your first two lines define a
function called "f" that is equivalent to my "string_of_mul" function:

<!T> String f(Expr<!T> t) = toString(t);

override <!T> String f(Add<!T> t) = "(" + toString(t) + ")";

The first of those two lines is equivalent to my catchall pattern.

> > Finally, I suspect you cannot "include some type declarations it omits"
> > because I'm using polymorphic variants. The OO equivalent would be
> > something like class hierarchies with multiple inheritance from some
> > other (polymorphic) classes. You might be able to implement that in C++
> > using templates but I doubt many OO languages could represent it without
> > sacrificing static checking.
>
> I'm sure that's the case. Most good OO languages are dynamically
> typed. Scala is an exception to that, but I can't think of a good way
> to do it there. You can encode generic sum types in Scala's pattern
> matching without *too* much overhead, but it's pretty grim.

Yes. That is certainly the impression I got. OCaml and F# also provide OO
features but I think only F# allows an equivalent of multiple dispatch.

> Here's a translation of the Nice code.

> ...


> // Differentiation of expressions.
> Expr<'a> diff(Expr<'a> v, 'a x); // Type parameters are now implicit,
> indicated by starting with '.

You seem to be using both overloading and type inference. I do not believe
that is possible.

> `*` (Expr<'a> x, Int<'a> 0) = Int 0; // Still have to repeat this. :(

In general, you will have to factor out these arbitrarily-complicated repeated
expressions into separate functions.

> `*` (Expr<'a> x, Mult<'a> u v) = (x * u) * v;

This is relying upon the ability to destructure one level only, right?

> > From what I've seen, the multiple dispatch approach seems to be
> > significantly worse (i.e. several times as much code) at handling a small
> > subset of patterns (shallow and enumerated constructors) but it doesn't
> > look as if it
>
> I've no idea where this "several times as much code" is coming from.

The original 40 vs 5 line comparison.

> The only place where the multiple dispatch was line for line longer
> was in the inability to chain dispatches nicely (which the above
> pseudo-code fixes) and the use of or patterns (which it admittedly
> doesn't. I can't think of a way to fix that at the moment).

Exactly, yes.

> > Also, how do you distinguish between open and closed sum types and how
> > does it
>
> All sum types are open. Types may be final however.

If all sum types are open then you cannot provide static exhaustiveness and
redundancy checking?

> > handle overlapping sum types?
>
> You need to provide a specialisation of the method for the overlap.
> (Nice has the annoying misfeature that it only complains at you when
> you actually try to invoke the function on the overlap, but that's not
> inherent to the concept).

Ok.

> > I guess I'm missing a killer example that uses multiple dispatch to
> > achieve something that cannot be achieved so easily with pattern
> > matching. Do you have such an example?
>
> Well, so far all your examples have been along the lines of "Hey,
> pattern matching is much better at being pattern matching than
> multiple dispatch is!". This isn't very surprising. :-) Let's step
> away from pattern matching style examples.

Sure.

I'm not sure I completely understand the problem but can you not simply have:

Seq.split : ('a -> bool) -> 'a seq -> 'a seq * 'a seq

and a specialized version for arrays:

Array.split : ('a -> bool) -> 'a array -> 'a array * 'a array

Are you asking for run-time dispatch to type specialized functions?

> > Right, that's because the pattern is one deeper than before. Patterns can
> > be arbitrarily deep, of course, and I think you will have to invent a new
> > function for every depth of every pattern in general.
>
> What do you mean by 'depth' here? Are we talking about or patterns or
> nesting constructors? If the former, I think you're right. If the
> latter, see pseudocode.

I was referring to A(1, B(2, C(3, D 4))) and so on as "deep patterns".
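As a concrete sketch of such a deep pattern (constructor names invented for the example), a single clause can destructure four levels of nesting at once:

```ocaml
(* Invented constructors, purely to show the nesting. *)
type t = A of int * t | B of int * t | C of int * t | D of int

let is_deep = function
  | A (1, B (2, C (3, D 4))) -> true  (* one clause, four levels deep *)
  | _ -> false

let () = print_endline (string_of_bool (is_deep (A (1, B (2, C (3, D 4))))))
```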

> > Again, this is because you've manually handled a 1-deep pattern by
> > manually calling fst and snd to deconstruct it. If you want to
> > deconstruct deeper then I think you'll have to resort to other
> > techniques.
>
> Right. But that's just a destructuring. It's not part of the actual match.

Well, destructuring is part of pattern matching, of course.

> > > // Multiplication
> > > <!T> Expr<T> `*` (int i, Expr<T> x) = new Mult(int(i), x);
> > > override <!T> Expr<T> `*` (int i, Int<T> x) = int(i * x.i);
> > > override <!T> Expr<T> `*` (0, x) = int(0);
> > > override <!T> Expr<T> `*` (int 1, x) = x;
> > >
> > > <!T> Expr<T> `*` (Expr<T> x, int i) = new Mult(x, int(i));
> > > override <!T> Expr<T> `*` (Int<T> x, int i) = int(i * x.i);
> > > override <!T> Expr<T> `*` (x, int 0) = int(0);
> > > override <!T> Expr<T> `*` (x, int 1) = x;
> > >
> > > override <!T> Expr<T> `*` (Int<T> i, Expr<T> x) = i.i * x;
> > > override <!T> Expr<T> `*` (Expr<T> x, Int<T> i) = i.i * x;
> > > override <!T> Expr<T> `*` (Expr<T> x, Mult<T> y) = (x * y.fst) * y.snd;
> >
> > Which bit gets called when you multiply a Var by a Var?
>
> Sorry, I already had a definition of * further up for convenience. See:
>
> <!T> Expr<!T> `*` (Expr<!T> fst, Expr<!T> snd) = new Mult(fst : fst, snd :
> snd);
>
> This is what gets called in the default case.

Ok. So 6 lines of OCaml became 12 lines with multiple dispatch (excluding the
type declaration, imports etc.).

> > Yes. Also, the use of overloading conflicts with type inference so you
> > are forced to explicitly annotate all of the types yourself. Even if you
> > could add features to the language to help, I do not see a way around
> > that.
>
> Some of that's a syntax issue, as mentioned above.
>
> You *do* have to declare the parameter types, but for the most part
> only in the same way that you have to declare the constructor names
> when pattern matching. The types of the parameters are syntactically
> meaningful, which is why they can't be inferred. Return types should
> generally be able to be inferred a bit better than Nice manages I
> think, unless you want to be able to dispatch on them (which is
> possible in only some cases in Nice)

Right, but choosing overloading has cost you type inference and automatic
generalization. That is one of my main gripes with Scala because it leads to
enormous verbosity.

> In general the overhead in Nice over basic pattern matching is
> approximately equivalent to providing a type signature for each top
> level function, which is often considered good practice anyway. There
> is indeed not a good equivalent of or patterns, guards, etc.

Per-function type annotations are considered good practice in Haskell because
it has an undecidable type system, but not in ML, where you generally see no
type annotations whatsoever.
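As a minimal sketch of that point, OCaml infers a fully polymorphic type with no annotations anywhere:

```ocaml
(* No annotations anywhere; OCaml infers the most general type. *)
let rec map f = function
  | [] -> []
  | x :: xs -> f x :: map f xs
(* Inferred: val map : ('a -> 'b) -> 'a list -> 'b list *)

let () = List.iter print_endline (map string_of_int [1; 2; 3])
```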

> > If you are not aware of modern pattern matchers (i.e. since SML) then I
> > think
>
> Well, my SML is a bit rusty too. But the only bit of pattern matching
> that couldn't be done in Haskell (which I know at least moderately
> well) was the or pattern (which is very nice. I've definitely wanted
> something like that before). The variants couldn't, but that's a type
> system issue, not pattern matching per se.

They are tied together:

http://caml.inria.fr/pub/papers/garrigue-deep-variants-2004.pdf

> You on the other hand seem perfectly willing to argue that pattern
> matching is superior to multiple dispatch without knowing much about
> it at all.

I'm not arguing that pattern matching is better. I'm simply saying that I
refuse to believe the opposite without any evidence.

This is my attempt at translating Jacques Garrigue's "Code reuse through
polymorphic variants" paper to this problem:

type 'a expr =
  [ `BinOp of [ `Add | `Mul ] * 'a * 'a
  | `Int of int
  | `UnOp of [ `Fact | `Fun of string ] * 'a
  | `Var of string ]

let symbol_of_unop = function
  | `Fact -> "!"
  | `Fun f -> f

let symbol_of_binop = function
  | `Add -> "+"
  | `Mul -> "*"

let s s unop binop expr =
  match expr with
  | `Int n -> string_of_int n
  | `Var s -> s
  | `UnOp(op, f) -> s f ^ unop op
  | `BinOp(op, f, g) -> s f ^ binop op ^ s g

let rec string_of_expr expr =
  s string_of_expr symbol_of_unop symbol_of_binop expr

The essence of this solution is to untie the recursive knot of the sum type.
So the recursive expr type has been replaced by the type "'a expr as 'a".
This explicit recursion makes the expr type extensible because we can use:

type expr2 = [ expr2 expr | ... ]

> Relevant features: Although unary ops are normally postfix, functions
> are prefix. Additionally, this needs to be open so we can add new
> operators in other modules. toString should handle additions without
> modification unless they require a specific change to the rules.

You can still extend the toString function, albeit with a fixed amount of
boiler-plate code at each extension to close the recursion again and make it
usable. For example, adding floats:

type 'a expr2 = [ 'a expr | `Float of float ]

let s2 s2 unop binop expr =
  match expr with
  | `Float x -> string_of_float x
  | #expr as expr -> s s2 unop binop expr

let rec string_of_expr2 (expr : 'a expr2 as 'a) =
  s2 string_of_expr2 symbol_of_unop symbol_of_binop expr
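Putting the pieces above together, here is a self-contained version of the sketch (with one binding renamed to avoid shadowing), showing that the extended printer accepts both the old and the new constructors:

```ocaml
(* Self-contained sketch of the open-recursion encoding above. *)
type 'a expr =
  [ `BinOp of [ `Add | `Mul ] * 'a * 'a
  | `Int of int
  | `UnOp of [ `Fact | `Fun of string ] * 'a
  | `Var of string ]

type 'a expr2 = [ 'a expr | `Float of float ]

let symbol_of_unop = function `Fact -> "!" | `Fun f -> f
let symbol_of_binop = function `Add -> "+" | `Mul -> "*"

(* The pre-tied printer: recursion goes through the first argument. *)
let s s unop binop = function
  | `Int n -> string_of_int n
  | `Var v -> v
  | `UnOp (op, f) -> s f ^ unop op
  | `BinOp (op, f, g) -> s f ^ binop op ^ s g

let rec string_of_expr expr =
  s string_of_expr symbol_of_unop symbol_of_binop expr

(* Extension: handle the new case, delegate the rest, re-tie the knot. *)
let s2 s2 unop binop = function
  | `Float x -> string_of_float x
  | #expr as e -> s s2 unop binop e

let rec string_of_expr2 (expr : 'a expr2 as 'a) =
  s2 string_of_expr2 symbol_of_unop symbol_of_binop expr

let () =
  print_endline (string_of_expr (`BinOp (`Add, `Int 1, `Var "x")));
  print_endline (string_of_expr2 (`BinOp (`Mul, `Float 2., `Var "x")))
```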

> It seems to me that while multiple dispatch can only handle shallow
> patterns, pattern matching can only handle shallow hierarchies. Maybe
> OCaml has some killer feature which makes this not the case, but I
> definitely don't see it in what you've shown so far.

I think there are two approaches based upon pattern matching that are relevant
here:

Firstly, you have ordinary nested variant patterns. Using this approach, you
must replace all functions that depend upon the type you extend, which is
asymptotically more code than the OOP approach.

Secondly, you have polymorphic variants in OCaml (not SML, Haskell or F#) that
provide overlapping sum types, inferred sum types and open sum types. These
provide the same extensibility as the OOP approach.
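A minimal sketch of the overlap point (type names invented for illustration): the `Int and `Float tags below belong to two sums, and a function over the smaller sum is reused verbatim on the larger one.

```ocaml
(* Two sums sharing tags: num is a sub-sum of arith. *)
type num   = [ `Int of int | `Float of float ]
type arith = [ num | `Add of arith * arith ]

(* A function over the smaller sum... *)
let string_of_num : num -> string = function
  | `Int n -> string_of_int n
  | `Float x -> string_of_float x

(* ...is reused directly on the overlapping cases of the larger sum. *)
let rec string_of_arith : arith -> string = function
  | #num as n -> string_of_num n
  | `Add (a, b) -> string_of_arith a ^ "+" ^ string_of_arith b

let () = print_endline (string_of_arith (`Add (`Int 1, `Float 2.5)))
```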

However, every time you add any extensibility like this you are undermining
static checking and performance. If you opt for complete extensibility then
that basically obviates all static checking and you can kiss goodbye to
competitive performance. In all of my time coding, I have never used the
above design pattern to make my code extensible. In fact, I would strongly
advise against it because you rapidly end up with an unmaintainable rat's nest
of dependencies. You'd need some pretty fancy whole program analysis and
refactoring to recover such a project once it gets hairy.

> It also occurs to me that there's absolutely no reason one couldn't
> combine the two paradigms. You could pretty much drop in multiple
> dispatch as a replacement for the constructor matching in a normal
> pattern matching system. I wonder how well that would work. Maybe I
> should give it a try. I've been wibbling about putting some of my
> half-baked language ideas into practice for the best part of a year
> now. :-)

You could probably retrofit that onto OCaml quite easily as a macro.

David MacIver

Nov 11, 2007, 10:34:05 AM
to jvm-la...@googlegroups.com
On Nov 11, 2007 1:54 PM, Jon Harrop <j...@ffconsultancy.com> wrote:
>
> On Sunday 11 November 2007 11:39, David MacIver wrote:
> > On Nov 10, 2007 8:42 PM, Jon Harrop <j...@ffconsultancy.com> wrote:
> > > That is certainly a different comparison but I do not understand why you
> > > would consider it "fairer".
> >
> > Because a) Using explicitly named types has certain advantages, both
> > in terms of the programmer's usage and encoding invariants in the
> > program
>
> Explicit types also have disadvantages. I think it is only fair to consider
> both approaches.

Certainly. I'm perfectly willing to consider implicit types as a
valuable thing. But it's an independent discussion.

> > and b) Your claim that this is an advantage of the "ML family
> > of languages" then actually holds. Currently you're comparing a very
> > specific feature of OCaml (which its immediate descendant F# has
> > inherited).
>
> The ML family of languages basically is OCaml and F# now:

I'm sure that will come as news to the people working on MLton, Alice, SML/NJ...

> http://ocamlnews.blogspot.com/2007/11/most-popular-functional-languages-on.html

That's using a rather specific metric. I rarely use apt for language
runtimes/compilers I actually care about because it's usually a few
versions behind.

> > ...
> > Nice requires explicit packages for everything. This is a Java
> > interoperability issue.
>
> Ok. Do any languages offer multiple dispatch without this baggage?

Do I need to jump up and down and shout 'CLOS' a few more times? :-)
There are a wide variety of multiple dispatch object systems for
Scheme as well. Clojure appears to offer one, but I don't know what
its approach to packages is. In general, multiple dispatch seems
rather popular in the Lisp world.

If you want a statically typed version there's also Needle (
http://www.nongnu.org/needle/ ), but I've never used it and the
project seems mostly dead, and Jazz ( http://www.exalead.com/jazz/ ),
which I don't know the current state of.

In general there really is a lack of a good language which combines
these features and is actually still alive. My conjecture is that this
is because the OO fans go "Eww, multiple dispatch! Give me single
dispatch or give me death!", the FP people go "Weird. It looks like
pattern matching but... different" and the Lispers go "Oh noes! Itth
thtatically typed! Run!". :-) It's a style which sits in an odd middle
ground that hasn't got much attention.

> What are the implications of the warning? Is the code no longer statically
> typed?

The idea seems to just be to make it clear when one method overrides
another. Because of Nice's ad hoc overloading on top of everything
else it isn't always.

> > > | e -> string_of () e
> > >
> > > That translated to:
> > >
> > > <!T> String toString (Int<!T> i) = toString(i.i);
> > > <!T> String toString (Var<!T> v) = toString(v.t);
> > > <!T> String toString (Mult<!T> m) = f(m.fst) + f(m.snd);
> > >
> > > So you had to handle every other constructor individually by hand. That
> > > is asymptotically more code.
> >
> > Either you've misunderstood the translation, or I've misunderstood
> > what your code is doing:
>
> I think I misunderstood the translation. Your first two lines define a
> function called "f" that is equivalent to my "string_of_mul" function:

Yes, sorry, I should have followed a similar naming convention.

> <!T> String f(Expr<!T> t) = toString(t);
> override <!T> String f(Add<!T> t) = "(" + toString(t) + ")";
>
> The first of those two lines is equivalent to my catchall pattern.

Right. The catch alls translate to a default implementation which
others override.

> > Here's a translation of the Nice code.
> > ...
> > // Differentiation of expressions.
> > Expr<'a> diff(Expr<'a> v, 'a x); // Type parameters are now implicit,
> > indicated by starting with '.
>
> You seem to be using both overloading and type inference. I do not believe
> that is possible.

The return types can be inferred because the function is an override
(you still need to specify the return type of the top level version).
The rest can't. The implicit type parameters aren't inferred - it's
just that 'a introduces a new type parameter where it's not already in
scope. It's just syntactic sugar.

> > `*` (Expr<'a> x, Int<'a> 0) = Int 0; // Still have to repeat this. :(
>
> In general, you will have to factor out these arbitrarily-complicated repeated
> expressions into separate functions.

Yes, I think that's probably the case. See comments about how adding
some form of pattern matching on top of this would probably be really
useful.

> > `*` (Expr<'a> x, Mult<'a> u v) = (x * u) * v;
>
> This is relying upon the ability to destructure one level only, right?

There's in principle no reason why you can't nest this arbitrarily.
It's just that the implementation gets more complicated. This code
didn't seem to need it.

> > > From what I've seen, the multiple dispatch approach seems to be
> > > significantly worse (i.e. several times as much code) at handling a small
> > > subset of patterns (shallow and enumerated constructors) but it doesn't
> > > look as if it
> >
> > I've no idea where this "several times as much code" is coming from.
>
> The original 40 vs 5 line comparison.

Ok. But the Nice example is significantly closer, and the pseudocode is
closer still: about 70% of it can be mechanically translated into Nice,
and the remainder requires some more intelligent compilation. The
original code was basically Java. If you're going to argue that Java is
more verbose than OCaml I don't think you'll gain much except a chorus
of agreement. :-)

> > > Also, how do you distinguish between open and closed sum types and how
> > > does it
> >
> > All sum types are open. Types may be final however.
>
> If all sum types are open then you cannot provide static exhaustiveness and
> redundancy checking?

It's true that you can't provide redundancy checking, because
redundancy isn't actually possible. Exhaustiveness can still be
checked though. It's just that one more often needs a default
implementation to guarantee it than one would need catch all patterns
in OCaml.

> I'm not sure I completely understand the problem but can you not simply have:
>
> Seq.split : ('a -> bool) -> 'a seq -> 'a seq * 'a seq
>
> and a specialized version for arrays:
>
> Array.split : ('a -> bool) -> 'a array -> 'a array * 'a array

You can. But then if you want a function which depends only on the
ability to split things you can't write one which accepts an arbitrary
sequence and splits it - every time you want to specialise to arrays
you need to write a separate version of that function which only
accepts arrays in order to make use of that. And if you wanted to
write a specialised split for finger trees you'd need to do that too.

Now suppose we wanted to define (for example. I can't actually think
of a use for it :-) ):

(Sequence<'a> 'b, Sequence<'a> 'c) splits ('a -> bool f, 'b x, 'c y) =
(split(f, x), split(f, y))

In your version we would need to provide potentially 9 different
versions of this! One for each combination of 'finger tree', 'array'
and 'anything else'.

> Are you asking for run-time dispatch to type specialized functions?

Yes (Although it is often resolvable at compile time. e.g. if I apply
split to something known to be an array at compile time it shouldn't
cause a runtime check on the type). That's exactly what multiple
dispatch *does*.

> > What do you mean by 'depth' here? Are we talking about or patterns or
> > nesting constructors? If the former, I think you're right. If the
> > latter, see pseudocode.
>
> I was referring to A(1, B(2, C(3, D 4))) and so on as "deep patterns".

Ok. But there's no reason those are noticeably harder (except at the
implementation level, in order to make them efficient) than
destructuring a single level. If I can destructure x in f(x) to be (1,
y) then there's no reason I can't destructure y to be (2, z).

> > Right. But that's just a destructuring. It's not part of the actual match.
>
> Well, destructuring is part of pattern matching, of course.

Yes. But it's also possible independently of pattern matching (even
Javascript can do it, although I don't know if it handles nesting).
It's incredibly easy to implement - what's difficult is inferring the
necessary patterns from each destructuring. In particular I'm trying
to point out that destructuring + multiple dispatch is equivalently
powerful to (basic, i.e. sans or patterns) pattern matching.

> > This is what gets called in the default case.
>
> Ok. So 6 lines of OCaml became 12 lines with multiple dispatch (excluding the
> type declaration, imports etc.).

Yes. In the pseudo-code it becomes more competitive (although still
longer). (One could, incidentally, cut out a lot of the mess by
defining an overload that `*` (Expr<T> x, Int<T> i) = i * x, but that
changes the semantic meaning in general even though it's ok here, so I
wanted to avoid it).

> Right, but choosing overloading has cost you type inference and automatic
> generalization. That is one of my main gripes with Scala because it leads to

The ad hoc overloading certainly does. I rather wish Nice didn't have
it. Pure ML-sub can retain a reasonable amount of type inference.
Presumably pure ML-sub with structural interfaces (the core of Nice's
type system) can too.

> enormous verbosity.
>
> > In general the overhead in Nice over basic pattern matching is
> > approximately equivalent to providing a type signature for each top
> > level function, which is often considered good practice anyway. There
> > is indeed not a good equivalent of or patterns, guards, etc.
>
> Per-function type annotations are considered good practice in Haskell because
> it has an undecidable type system, but not in ML, where you generally see no
> type annotations whatsoever.

Haskell's core type system isn't undecidable as far as I know. Type
inference is entirely possible and no less straightforward than in ML
with Haskell's core Hindley Milner + type classes type system. It's
more a case of readability and explicit public contracts for the
module. Haskell's type system plus a bazillion add-ons found only in GHC
head is, of course, undecidable. ;-)

> > > If you are not aware of modern pattern matchers (i.e. since SML) then I
> > > think
> >
> > Well, my SML is a bit rusty too. But the only bit of pattern matching
> > that couldn't be done in Haskell (which I know at least moderately
> > well) was the or pattern (which is very nice. I've definitely wanted
> > something like that before). The variants couldn't, but that's a type
> > system issue, not pattern matching per se.
>
> They are tied together:
>
> http://caml.inria.fr/pub/papers/garrigue-deep-variants-2004.pdf

Argh. Not another paper to read. My pile of such is a mile high at the
moment. ;-)

Interesting. I note that the type parameter on expr seems to have
become a String. Is this a necessary change? (Note that mine works
with any type parameter T because the toString method can accept
different types generically).

It also looks rather like you're manually compiling type hierarchies
into pattern matching. ;-)

> > Relevant features: Although unary ops are normally postfix, functions
> > are prefix. Additionally, this needs to be open so we can add new
> > operators in other modules. toString should handle additions without
> > modification unless they require a specific change to the rules.
>
> You can still extend the toString function, albeit with a fixed amount of
> boiler-plate code at each extension to close the recursion again and make it
> usable. For example, adding floats:
>
> type 'a expr2 = [ 'a expr | `Float of float ]
>
> let s2 s2 unop binop expr =
> match expr with
> | `Float x -> string_of_float x
> | #expr as expr -> s s2 unop binop expr
>
> let rec string_of_expr2 (expr : 'a expr2 as 'a) =
> s2 string_of_expr2 symbol_of_unop symbol_of_binop expr

Ok. So it isn't extensible at all. It merely allows you to build
similar things on top of it, and if I were to write a function which
accepts an expr it wouldn't be able to accept an expr2, even if the
function definition had a perfectly valid catch all term, right?

So this seems to do significantly less than the multiple dispatch
version in terms of allowing for extensibility.

> > It seems to me that while multiple dispatch can only handle shallow
> > patterns, pattern matching can only handle shallow hierarchies. Maybe
> > OCaml has some killer feature which makes this not the case, but I
> > definitely don't see it in what you've shown so far.
>
> I think there are two approaches based upon pattern matching that are relevant
> here:
>
> Firstly, you have ordinary nested variant patterns. Using this approach, you
> must replace all functions that depend upon the type you extend, which is
> asymptotically more code than the OOP approach.
>
> Secondly, you have polymorphic variants in OCaml (not SML, Haskell or F#) that
> provide overlapping sum types, inferred sum types and open sum types. These
> provide the same extensibility as the OOP approach.

No they don't. They provide a different set of extensibilities - some
more, some less. In particular they provide no support for
encapsulation of type (see something like my Sequence example above)
or for one constructor inheriting from another. On the other hand the
structural subtyping and the ability to have overlaps is very nice,
and is something that most (statically typed - it Just Works in
dynamically typed languages, but you lose compiler checking) OOP
languages don't provide.

> However, every time you add any extensibility like this you are undermining
> static checking and performance. If you opt for complete extensibility then

You've not pointed out any problems with static checking in Nice.
Indeed there's a soundness proof for ML-sub. And we wouldn't want to
use a language without a soundness proof for its type system, would
we? ;-)

> that basically obviates all static checking and you can kiss goodbye to
> competitive performance. In all of my time coding, I have never used the

Depends what you mean by "competitive performance". Nice performance
is generally within a factor of two of Java's according to the
language shootout, and its dispatch mechanism could still use a fair
bit of optimising as I understand it (which, I suspect, it will never
receive).

> above design pattern to make my code extensible. In fact, I would strongly
> advise against it because you rapidly end up with an unmaintainable rat's nest
> of dependencies. You'd need some pretty fancy whole program analysis and
> refactoring to recover such a project once it gets hairy.

Ah. The "My language deficiency is actually a feature" argument.

> > It also occurs to me that there's absolutely no reason one couldn't
> > combine the two paradigms. You could pretty much drop in multiple
> > dispatch as a replacement for the constructor matching in a normal
> > pattern matching system. I wonder how well that would work. Maybe I
> > should give it a try. I've been wibbling about putting some of my
> > half-baked language ideas into practice for the best part of a year
> > now. :-)
>
> You could probably retrofit that onto OCaml quite easily as a macro.

I doubt it.

David MacIver

unread,
Nov 11, 2007, 2:05:47 PM 11/11/07
to jvm-la...@googlegroups.com
On Nov 11, 2007 3:34 PM, David MacIver <david....@gmail.com> wrote:
> > Per-function type annotations are considered good practice in Haskell because
> > it has an undecidable type system but not in ML, where you generally see no
> > type annotations whatsoever.
>
> Haskell's core type system isn't undecidable as far as I know. Type
> inference is entirely possible and no less straightforward than in ML
> with Haskell's core Hindley Milner + type classes type system. It's
> more a case of readability and explicit public contracts for the
> module. Haskell's type system + a bazillion add-ons found only in GHC
> head is of course. ;-)

Sorry, talking at cross purposes (and I made a mistake). Type
*checking* for the core type system is decidable. Type inference isn't
(e.g. show(read "1") needs a type annotation).

In general I'm willing to live with features that can't be inferred as
long as a large enough subset of the language can be. Adding explicit
type annotations for features which allow more expressivity doesn't
strike me as an unreasonable trade off.

Jon Harrop

unread,
Nov 11, 2007, 2:11:08 PM 11/11/07
to jvm-la...@googlegroups.com
On Sunday 11 November 2007 15:34, David MacIver wrote:
> On Nov 11, 2007 1:54 PM, Jon Harrop <j...@ffconsultancy.com> wrote:
> > Explicit types also have disadvantages. I think it is only fair to
> > consider both approaches.
>
> Certainly. I'm perfectly willing to consider implicit types as a
> valuable thing. But it's an independent discussion.

Ah, ok. I think you will have trouble distinguishing between inferred sum
types and extensible sum types in OCaml because they are provided by the same
construct. Indeed, the inference is an integral part of their extensibility.
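
To make that concrete, here is a minimal sketch of inferred open sum
types in OCaml (the variant names are invented for illustration):

```ocaml
(* OCaml infers an open polymorphic variant type from usage alone;
   no type declaration is written anywhere. *)
let double x = `Add (x, x)
(* inferred: val double : 'a -> [> `Add of 'a * 'a ] *)

let rec depth = function
  | `Int _ -> 1
  | `Add (f, g) -> 1 + max (depth f) (depth g)
(* inferred: accepts any expression built from `Int and `Add *)

let () = assert (depth (double (`Int 3)) = 2)
```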

> > The ML family of languages basically is OCaml and F# now:
>
> I'm sure that will come as news to the people working on MLTon, Alice,
> SML/NJ...

I doubt it. SML/NJ is the most popular SML compiler and has never been updated
to support x86-64. MLton is also popular but has only rudimentary support for
x86-64, with poor performance and regular segfaults. We sell more books on
OCaml than all SML books combined. The OCaml mailing lists have had 7,000
posts this year whereas .

> > http://ocamlnews.blogspot.com/2007/11/most-popular-functional-languages-on.html
>
> That's using a rather specific metric. I rarely use apt for language
> runtimes/compilers I actually care about because it's usually a few
> versions behind.

There would have to be thousands of people installing MLton from source but
not OCaml from source for that to be significant. I doubt that is the case.
In fact, I suspect OCaml's lead is larger than those results indicate because
most SML programmers will have more than one of SMLNJ, MLton and MosML
installed.

> > > ...
> > > Nice requires explicit packages for everything. This is a Java
> > > interoperability issue.
> >
> > Ok. Do any languages offer multiple dispatch without this baggage?
>
> Do I need to jump up and down and shout 'CLOS' a few more times? :-)
> There are a wide variety of multiple dispatch object systems for
> Scheme as well. Clojure appears to offer one, but I don't know what
> its approach to packages is. In general, multiple dispatch seems
> rather popular in the Lisp world.
>
> If you want a statically typed version there's also Needle (
> http://www.nongnu.org/needle/ ), but I've never used it and the
> project seems mostly dead, and Jazz ( http://www.exalead.com/jazz/ ),
> which I don't know the current state of.
>
> In general there really is a lack of a good language which combines
> these features and is actually still alive. My conjecture is that this
> is because the OO fans go "Eww, multiple dispatch! Give me single
> dispatch or give me death!", the FP people go "Weird. It looks like
> pattern matching but... different" and the Lispers go "Oh noes! Itth
> thtatically typed! Run!". :-) It's a style which sits in an odd middle
> ground that hasn't got much attention.

LOL. Interesting perspective. :-)

> > <!T> String f(Expr<!T> t) = toString(t);
> > override <!T> String f(Add<!T> t) = "(" toString(t) ")";
> >
> > The first of those two lines is equivalent to my catchall pattern.
>
> Right. The catch alls translate to a default implementation which
> others override.

That makes sense.

> > > Here's a translation of the Nice code.
> > > ...
> > > // Differentiation of expressions.
> > > Expr<'a> diff(Expr<'a> v, 'a x); // Type parameters are now implicit,
> > > indicated by starting with '.
> >
> > You seem to be using both overloading and type inference. I do not
> > believe that is possible.
>
> The return types can be inferred because the function is an override
> (you still need to specify the return type of the top level version).
> The rest can't. The implicit type parameters aren't inferred - it's
> just that 'a introduces a new type parameter where it's not already in
> scope. It's just syntactic sugar.

Ok.

> > > `*` (Expr<'a> x, Int<'a> 0) = Int 0; // Still have to repeat this. :(
> >
> > In general, you will have to factor out these arbitrarily-complicated
> > repeated expressions into separate functions.
>
> Yes, I think that's probably the case. See comments about how adding
> some form of pattern matching on top of this would probably be really
> useful.

Right. This is all about or-patterns. Vanilla SML doesn't have or-patterns and
that is another reason why I prefer OCaml.
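
For readers who haven't met them, a minimal sketch of OCaml
or-patterns (type and simplifier invented for illustration): the
symmetric cases share a single right-hand side instead of being
written out twice.

```ocaml
type expr =
  | Int of int
  | Var of string
  | Add of expr * expr
  | Mul of expr * expr

(* Each or-pattern collapses two symmetric cases into one clause. *)
let rec simplify = function
  | Mul (Int 0, _) | Mul (_, Int 0) -> Int 0
  | Mul (Int 1, e) | Mul (e, Int 1) -> simplify e
  | Add (Int 0, e) | Add (e, Int 0) -> simplify e
  | Add (f, g) -> Add (simplify f, simplify g)
  | Mul (f, g) -> Mul (simplify f, simplify g)
  | e -> e

let () = assert (simplify (Mul (Int 1, Add (Var "x", Int 0))) = Var "x")
```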

> > > `*` (Expr<'a> x, Mult<'a> u v) = (x * u) * v;
> >
> > This is relying upon the ability to destructure one level only, right?
>
> There's in principle no reason why you can't nest this arbitrarily.
> It's just that the implementation gets more complicated. This code
> didn't seem to need it.

I think if you do generalize that then you literally have pattern matching.

> > The original 40 vs 5 line comparison.
>
> Ok. But the Nice example is significantly closer. The pseudocode (of
> which about 70% can be mechanically translated into Nice and the
> remainder requires some more intelligent compilation). The original
> code was basically Java. If you're going to argue that Java is more
> verbose than OCaml I don't think you'll gain much except a chorus of
> agreement. :-)

:-)

> > If all sum types are open then you cannot provide static exhaustiveness
> > and redundancy checking?
>
> It's true that you can't provide redundancy checking, because
> redundancy isn't actually possible. Exhaustiveness can still be
> checked though. It's just that one more often needs a default
> implementation to guarantee it than one would need catch all patterns
> in OCaml.

Ok. That is one of the design flaws in Scala that cost it enormously in
productivity, IMHO.

> > I'm not sure I completely understand the problem but can you not simply
> > have:
> >
> > Seq.split : ('a -> bool) -> 'a seq -> 'a seq * 'a seq
> >
> > and a specialized version for arrays:
> >
> > Array.split : ('a -> bool) -> 'a array -> 'a array * 'a array
>
> You can. But then if you want a function which depends only on the
> ability to split things you can't write one which accepts an arbitrary
> sequence and splits it - every time you want to specialise to arrays
> you need to write a separate version of that function which only
> accepts arrays in order to make use of that. And if you wanted to write a
> specialised split for finger trees you'd need to do that too.
>
> Now suppose we wanted to define (for example. I can't actually think
> of a use for it :-) ):
>
> (Sequence<'a> 'b, Sequence<'a> 'c) splits ('a -> bool f, 'b x, 'c y) =
> (split(f, x), split(f, y))

OCaml programmers have been exposed to a solution to that problem for several
years but it never gained traction. Specifically, Jacques Garrigue published
a library that does exactly this using OCaml's OOP to perform the run-time
dispatch for functions over containers (exactly what you are talking about).

One of the main advantages is that you can now invoke members like "fold"
without an implicit type. So the old code:

let list_of_array a =
Array.fold_right (fun h t -> h::t) a []

may be written in a generalized form:

let list_of a =
a#fold_right (fun h t -> h::t) []

This alternative was not adopted for three reasons. Firstly, this incurs a
performance overhead that can be very significant (e.g. for empty arrays in
the above case the time taken is dominated by the dynamic dispatch to the
fold_right member). Secondly, in the vast majority of practically important
cases you can obtain the same genericity by simply passing the relevant
function to a higher-order function:

let list_of fold_right a =
fold_right (fun h t -> h::t) a []

In more complicated cases, you can parameterize over the container type using
functors rather than higher-order functions. Finally, in the vast majority of
cases you know exactly what data structure you're using anyway, so the
ability to treat something as an arbitrary sequence is basically useless.
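
Concretely, the higher-order-function version mentioned above can be
written and used like this (a sketch; `to_list` is an invented name):

```ocaml
(* The container's own fold is passed in explicitly, so one definition
   serves every container type with no run-time dispatch. *)
let to_list fold_right c = fold_right (fun h t -> h :: t) c []

let () =
  assert (to_list Array.fold_right [| 1; 2; 3 |] = [1; 2; 3]);
  assert (to_list List.fold_right [10; 20] = [10; 20])
```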

The disadvantages of the sequence approach are quite severe: the genericity
weakens the type system resulting in obfuscated error messages and less
reliable code. So the approach essentially undermines static typing, which is
not surprising because you are effectively reinventing dynamic typing.

So, again, you can solve this problem in OCaml but it is generally not a
problem worth solving.

> In your version we would need to provide potentially 9 different
> versions of this! One for each combination of 'finger tree', 'array'
> and 'anything else'.

Perhaps you didn't mean to choose a trivially decomposable problem but the
solution in that case is simply:

let split2 f a b = Seq.split f a, Seq.split f b

> > Are you asking for run-time dispatch to type specialized functions?
>
> Yes (Although it is often resolvable at compile time. e.g. if I apply
> split to something known to be an array at compile time it shouldn't
> cause a runtime check on the type). That's exactly what multiple
> dispatch *does*.

Right. The OCaml equivalent will never be resolved statically but it is
trivial to implement.

> > > What do you mean by 'depth' here? Are we talking about or patterns or
> > > nesting constructors? If the former, I think you're right. If the
> > > latter, see pseudocode.
> >
> > I was referring to A(1, B(2, C(3, D 4))) and so on as "deep patterns".
>
> Ok. But there's no reason those are noticably harder (except for at
> the implementation level in order to make them efficient) than
> destructuring a single level. If I can destructure x in f(x) to be (1,
> y) then there's no reason I can't destructure y to be (2, z).

I was careful when I chose my example. You are quite correct for a "singleton"
pattern that necessarily involves no dispatch such as (1, (2, z)) but the
pattern I gave might require run-time dispatch. If you handle that then you
are just reinventing pattern matching.

As an aside, I don't like the implication that pattern matching over product
types (what you call "deconstruction") should be treated differently to
pattern matching over sum types (what you call "multiple dispatch"): this is
the essence of ML-style pattern matching that makes it so powerful.
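
A small illustration of that uniformity (type and predicate invented):
one OCaml pattern dispatches on the sum-type constructors and
destructures the products inside them in the same expression.

```ocaml
type expr =
  | Int of int
  | Add of expr * expr
  | Mul of expr * expr

(* Sum-type dispatch (Mul/Add/Int) and product destructuring (the
   nested pairs) happen in one nested pattern. *)
let is_distributable = function
  | Mul (_, Add (_, _)) | Mul (Add (_, _), _) -> true
  | _ -> false

let () =
  assert (is_distributable (Mul (Int 2, Add (Int 1, Int 3))));
  assert (not (is_distributable (Add (Int 1, Int 2))))
```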

> > > Right. But that's just a destructuring. It's not part of the actual
> > > match.
> >
> > Well, destructuring is part of pattern matching, of course.
>
> Yes. But it's also possible independently of pattern matching (even
> Javascript can do it, although I don't know if it handles nesting).

Yes.

> It's incredibly easy to implement - what's difficult is inferring the
> necessary patterns from each destructuring. In particular I'm trying
> to point out that destructuring + multiple dispatch is equivalently
> powerful to (basic, i.e. sans or patterns) pattern matching.

I would say that destructuring + multiple dispatch is a basic form of pattern
matching (without or-patterns, guards and most of the other features added
since SML).

> > > This is what gets called in the default case.
> >
> > Ok. So 6 lines of OCaml became 12 lines with multiple dispatch (excluding
> > the type declaration, imports etc.).
>
> Yes. In the pseudo-code it becomes more competitive (although still
> longer). (One could, incidentally, cut out a lot of the mess by
> defining an overload that `*` (Expr<T> x, Int<T> i) = i * x, but that
> changes the semantic meaning in general even though it's ok here, so I
> wanted to avoid it).

Can you elaborate on that?

> > Right, but choosing overloading has cost you type inference and automatic
> > generalization. That is one of my main gripes with Scala because it leads
> > to
>
> The ad hoc overloading certainly does. I rather wish Nice didn't have
> it. Pure ML-sub can retain a reasonable amount of type inference.
> Presumably pure ML sub with structural interfaces (the core of Nice's
> type system) can too.

OCaml's approach is similar: you can fall back to SML style coding. In fact,
when given the choice, the vast majority of OCaml programmers write in SML
style.

> > Per-function type annotations are considered good practice in Haskell

> > because it has an undecidable type system but not in ML, where you


> > generally see no type annotations whatsoever.
>
> Haskell's core type system isn't undecidable as far as I know. Type
> inference is entirely possible and no less straightforward than in ML
> with Haskell's core Hindley Milner + type classes type system. It's
> more a case of readability and explicit public contracts for the
> module. Haskell's type system + a bazillion add-ons found only in GHC
> head is of course. ;-)

Haskell supports both polymorphic recursion and arbitrary-rank types (for
generic monads to work) and both make the type system undecidable without
manual type annotations.

MLs take the decidable approach and even needlessly prohibit inferred type
recursion. Consequently, ML code doesn't need superfluous type annotations to
keep errors under control as Haskell does.

> > type 'a expr =
> > [ `BinOp of [ `Add | `Mul ] * 'a * 'a
> >
> > | `Int of int
> > | `UnOp of [ `Fact | `Fun of string ] * 'a
> > | `Var of string ]
> >
> > let symbol_of_unop = function
> >
> > | `Fact -> "!"
> > | `Fun f -> f
> >
> > let symbol_of_binop = function
> >
> > | `Add -> "+"
> > | `Mul -> "*"
> >
> > let s s unop binop expr =
> > match expr with
> >
> > | `Int n -> string_of_int n
> > | `Var s -> s
> > | `UnOp(op, f) -> s f ^ unop op
> > | `BinOp(op, f, g) -> s f ^ binop op ^ s g
> >
> > let rec string_of_expr expr =
> > s string_of_expr symbol_of_unop symbol_of_binop expr
>

> Interesting. I note that the type parameter on expr seems to have
> become a String. Is this a necessary change? (Note that mine works
> with any type parameter T because the toString method can accept
> different types generically).

Sure, I missed that. It's a trivial fix, though:

type ('a, 'b) expr =
...
| `Var of 'b ]

etc.

> It also looks rather like you're manually compiling type hierarchies
> into pattern matching. ;-)

Exactly, yes. The only difference is that you're facing an asymptotic code
explosion and I'm not.

> > > Relevant features: Although unary ops are normally postfix, functions
> > > are prefix. Additionally, this needs to be open so we can add new
> > > operators in other modules. toString should handle additions without
> > > modification unless they require a specific change to the rules.
> >
> > You can still extend the toString function, albeit with a fixed amount of
> > boiler-plate code at each extension to close the recursion again and make
> > it usable. For example, adding floats:
> >
> > type 'a expr2 = [ 'a expr | `Float of float ]
> >
> > let s2 s2 unop binop expr =
> > match expr with
> > | `Float x -> string_of_float x
> > | #expr as expr -> s s2 unop binop expr
> >
> > let rec string_of_expr2 (expr : 'a expr2 as 'a) =
> > s2 string_of_expr2 symbol_of_unop symbol_of_binop expr
>
> Ok. So it isn't extensible at all. It merely allows you to build
> similar things on top of it,

Extensible means "can expand or add to its capabilities".

> and if I were to write a function which
> accepts an expr it wouldn't be able to accept an expr2, even if the
> function definition had a perfectly valid catch all term, right?

Wrong. If the function had a "perfectly valid catch all term" then it would
accept any supertype.

The expr type is a subtype of expr2 so you can pass an expr to a function
expecting an expr2 and it will work. If you write a generic function like
this:

# let is_int = function
| `Int _ -> true
| _ -> false;;
val is_int : [> `Int of 'a ] -> bool = <fun>

The argument type of this "is_int" function is a subtype of both expr and
expr2, so you can pass it values of either type.

Note how OCaml's polymorphic variants allow the subtyping relationships to be
inferred: there is no need to manually write out lots of new class
hierarchies.
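
A runnable version of that example; the variant types are written out
explicitly here only to make the subtype relationship visible (they
would normally be inferred):

```ocaml
(* is_int is inferred to accept any variant type containing `Int,
   so values of both the narrower and the wider type flow into it. *)
let is_int = function
  | `Int _ -> true
  | _ -> false

let e : [ `Int of int | `Var of string ] = `Int 1
let e2 : [ `Int of int | `Var of string | `Float of float ] = `Float 2.0

let () =
  assert (is_int e);
  assert (not (is_int e2))
```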

> So this seems to do significantly less than the multiple dispatch
> version in terms of allowing for extensibility.

Can you give a specific example?

> > Secondly, you have polymorphic variants in OCaml (not SML, Haskell or F#)
> > that provide overlapping sum types, inferred sum types and open sum
> > types. These provide the same extensibility as the OOP approach.
>
> No they don't. They provide a different set of extensibilities - some
> more, some less. In particular they provide no support for
> encapsulation of type (see something like my Sequence example above)

That is supported (see above).

> or for one constructor inheriting from another...

There are no constructors, only functions.

> > However, every time you add any extensibility like this you are
> > undermining static checking and performance. If you opt for complete
> > extensibility then
>
> You've not pointed out any problems with static checking in Nice.
> Indeed there's a soundness proof for ML-sub. And we wouldn't want to
> use a language without a soundness proof for its type system, would
> we? ;-)

You said it cannot support exhaustiveness checking because it cannot express
closed sum types? That means it cannot handle the single most important
construct found in the entire ML family of languages and Haskell, which would
seem to be a severe deficiency.
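
For contrast, a minimal sketch of what a closed sum type buys in OCaml
(names invented):

```ocaml
(* With a closed sum type, the compiler statically checks both
   exhaustiveness and redundancy. *)
type shape = Circle of float | Square of float

let area = function
  | Circle r -> 3.14159 *. r *. r
  | Square s -> s *. s
(* Omitting the Square case would trigger warning 8:
   "this pattern-matching is not exhaustive". *)

let () = assert (area (Square 3.0) = 9.0)
```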

> > that basically obviates all static checking and you can kiss goodbye to
> > competitive performance. In all of my time coding, I have never used the
>
> Depends what you mean by "competitive performance". Nice performance
> is generally within a factor of two of Java's according to the
> language shootout, and its dispatch mechanism could still use a fair
> bit of optimising as I understand it (which, I suspect, it will never
> receive).

Java is generally within a factor of two of OCaml.

Jon Harrop

unread,
Nov 11, 2007, 2:17:19 PM 11/11/07
to jvm-la...@googlegroups.com

If that were the trade-off I would agree. The problem is that, when you make a
mistake, Haskell's more expressive type system allows it to dream up all
sorts of crazily complicated types to satisfy the incorrect criteria implied
by your buggy code. So you end up with totally incomprehensible type errors
pointing at completely the wrong place. This is solved by adding type
annotations everywhere which bloats all of the code but keeps Haskell usable.

David MacIver

unread,
Nov 11, 2007, 2:27:07 PM 11/11/07
to jvm-la...@googlegroups.com

David MacIver

unread,
Nov 11, 2007, 2:27:30 PM 11/11/07
to jvm-la...@googlegroups.com
Gah. Misclick. Please ignore previous message. :-)

Rich Hickey

unread,
Nov 11, 2007, 2:38:54 PM 11/11/07
to JVM Languages

On Nov 11, 10:34 am, "David MacIver" <david.maci...@gmail.com> wrote:
> On Nov 11, 2007 1:54 PM, Jon Harrop <j...@ffconsultancy.com> wrote:
>
>
>
> > On Sunday 11 November 2007 11:39, David MacIver wrote:
> > > On Nov 10, 2007 8:42 PM, Jon Harrop <j...@ffconsultancy.com> wrote:
> > > > That is certainly a different comparison but I do not understand why you
> > > > would consider it "fairer".
>
> > > Because a) Using explicitly named types has certain advantages, both
> > > in terms of the programmer's usage and encoding invariants in the
> > > program
>
> > Explicit types also have disadvantages. I think it is only fair to consider
> > both approaches.
>
> Certainly. I'm perfectly willing to consider implicit types as a
> valuable thing. But it's an independent discussion.
>
> > > and b) Your claim that this is an advantage of the "ML family
> > > of languages" then actually holds. Currently you're comparing a very
> > > specific feature of OCaml (which its immediate descendant F# has
> > > inherited).
>
> > The ML family of languages basically is OCaml and F# now:
>
> I'm sure that will come as news to the people working on MLTon, Alice, SML/NJ...
>

> >http://ocamlnews.blogspot.com/2007/11/most-popular-functional-languag...


>
> That's using a rather specific metric. I rarely use apt for language
> runtimes/compilers I actually care about because it's usually a few
> versions behind.
>
> > > ...
> > > Nice requires explicit packages for everything. This is a Java
> > > interoperability issue.
>
> > Ok. Do any languages offer multiple dispatch without this baggage?
>
> Do I need to jump up and down and shout 'CLOS' a few more times? :-)
> There are a wide variety of multiple dispatch object systems for
> Scheme as well. Clojure appears to offer one, but I don't know what
> its approach to packages is. In general, multiple dispatch seems
> rather popular in the Lisp world.
>

FWIW, here's one way to do it using Clojure's multimethods
(http://clojure.sourceforge.net/reference/multimethods.html):

(defmulti d (fn [f x] (f 0)))
(defmethod d 'var [v x] ['int (if (eql? (v 1) x) 1 0)])
(defmethod d 'int [i x] ['int 0])
(defmethod d 'add [a x] ['add (d (a 1) x) (d (a 2) x)])
(defmethod d 'mul [m x] ['add ['mul (m 1) (d (m 2) x)] ['mul (m 2) (d (m 1) x)]])


> If you want a statically typed version there's also Needle
> ( http://www.nongnu.org/needle/ ), but I've never used it and the
> project seems mostly dead.

hlovatt

unread,
Nov 11, 2007, 3:22:53 PM 11/11/07
to JVM Languages
@Jon,

The examples you are giving are the same in nature as the previous
ones, so the translation is the same; hence I haven't translated the
new examples. (Although I could code the simplify example really well
using a different technique.) I find it strange that you, unlike me,
don't favour multiple function definitions; most functional
programmers would write factorial as:

int fac( 0 ) { return 1; }
int fac( int x ) { return x * fac( x - 1 ); }

(Please forgive the Javanese, adding overloading on values - which
would be a nice addition to Java along with adding methods to
classes.) Rather than:

int fac( int x ) { return x == 0 ? 1 : x * fac( x - 1 ); }

This is the same technique; multiple definitions rather than a
decision tree.
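
For comparison, the ML rendering of the same idea collapses the
multiple definitions into clauses of a single pattern match:

```ocaml
(* The value-overloaded definitions become clauses of one match. *)
let rec fac = function
  | 0 -> 1
  | n -> n * fac (n - 1)

let () = assert (fac 5 = 120)
```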

Most of your arguments seem to be around short syntax not about the
semantics of the language. If shorter syntax is of such importance
then add some sugar, for example you could lift ML syntax for class
variables (Scala and Fortress do), e.g.:

class Lit( public final double value ) extends AbstractExp {}

instead of:

public class Lit extends AbstractExp {
public final double value;
public Lit( final double value ) { this.value = value; }
}

I think this sort of sugar is a good idea for common cases, but I
prefer a clear syntax to a short syntax so wouldn't strive for short
syntax except in common cases. As a general rule keywords are clearer
than symbols, e.g. extends in Java rather than : in C++. A further
point is that once you have added in comments and moved to real
programs that are written with clarity and maintenance in mind,
instead of examples, the difference in length between programming
languages tends to diminish. Most people these days use an IDE, so
typing isn't a problem; the IDE does it. The examples given are in PEC,
which is intended to use existing Java syntax (i.e. no sugar) so that
it is compatible with existing tools; with no extra syntax it is
hardly surprising that the examples are longer.

Moving back to examples, but this time semantic in nature rather than
syntactic. Suppose that I was writing a program and using the symbolic
math library Exp that was from a supplier, I didn't write it. Now in
my program I want to add Pow, but even if I have the code for Exp it
isn't a good idea to change Exp since I will have introduced a fork
(you don't even want to recompile Exp - just use a binary). With
multiple dispatch you can:

public class Pow extends AbstractExp {
public final Var var;
public final int pow;
public Pow( Var var, int pow ) {
this.var = var;
this.pow = pow;
}
public String toString() { return var + "^" + pow; }
public final static Exp d( Pow p, Var x ) {
if ( p.var != x ) { return new Lit( 0 ); }
return new Mul( new Lit( p.pow ), new Pow( p.var, p.pow - 1) );
}
}

This is hard (I wouldn't say impossible since I don't know ML well
enough) in the closed world view of ML semantics.

In summary: introduce some sugar for common cases; but don't go
overboard, because the semantics are the really important thing.

Cheers,

Howard.

Jon Harrop

unread,
Nov 11, 2007, 7:07:28 PM 11/11/07
to jvm-la...@googlegroups.com
On Sunday 11 November 2007 20:22, hlovatt wrote:
> Most of your arguments seem to be around short syntax not about the
> semantics of the language. If shorter syntax is of such importance
> then add some sugar,

The lack of features like inference and automatic generalization is the source
of verbosity here and you can't address that with sugar: it is a deficiency
in the language.

> Moving back to examples, but this time semantic in nature rather than
> syntactic. Suppose that I was writing a program and using the symbolic
> math library Exp that was from a supplier, I didn't write it. Now in
> my program I want to add Pow, but even if I have the code for Exp it
> isn't a good idea to change Exp since I will have introduced a fork
> (you don't even want to recompile Exp - just use a binary). With
> multiple dispatch you can:
>
> public class Pow extends AbstractExp {
> public final Var var;
> public final int pow;
> public Pow( Var var, int pow ) {
> this.var = var;
> this.pow = pow;
> }
> public String toString() { return var + "^" + pow; }
> public final static Exp d( Pow p, Var x ) {
> if ( p.var != x ) { return new Lit( 0 ); }
> return new Mul( new Lit( p.pow ), new Pow( p.var, p.pow - 1) );
> }
> }
>
> This is hard (I wouldn't say impossible since I don't know ML well
> enough) in the closed world view of ML semantics.
>
> In summary: introduce some sugar for common cases; but don't go
> overboard, because the semantics are the really important thing.

Jacques Garrigue's paper on code reuse describes one way of doing that in OCaml
(using polymorphic variants) but I've never encountered that problem (the
expression problem) in practice and all solutions to it are necessarily
invasive.
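
For the curious, the idea from that paper can be sketched as follows.
This is a simplified evaluator rather than differentiation, with
invented names; it illustrates the technique, not code from the paper.

```ocaml
(* Supplier's code: the recursion is left open ('a stands for the
   full expression type) so clients can extend it later. *)
type 'a core = [ `Lit of float | `Add of 'a * 'a | `Mul of 'a * 'a ]

let eval_core eval : 'a core -> float = function
  | `Lit x -> x
  | `Add (f, g) -> eval f +. eval g
  | `Mul (f, g) -> eval f *. eval g

(* Client's code: `Pow is added without editing or recompiling the
   supplier's module; the recursive knot is retied over the wider type. *)
type 'a ext = [ 'a core | `Pow of 'a * int ]

let eval_ext eval : 'a ext -> float = function
  | `Pow (f, n) -> eval f ** float_of_int n
  | #core as e -> eval_core eval e

let rec eval (e : 'a ext as 'a) = eval_ext eval e

let () =
  assert (abs_float (eval (`Pow (`Add (`Lit 1., `Lit 2.), 2)) -. 9.0) < 1e-9)
```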

hlovatt

unread,
Nov 12, 2007, 1:47:48 AM 11/12/07
to JVM Languages
@Jon,

I would like to see more type inference added to Java, something along
the lines that Scala or Fortress does. I am not sure about the
extensive inference common in functional languages; I find the error
messages poor. In fact I have suggested extensions including more
inference already:

http://www.artima.com/weblogs/viewpost.jsp?thread=182412

Changing subject: an example of the expression problem that is
pertinent to discussing computer languages is if you write a parser
for a language and then want to expand the language at a later date.
In this case you might need to expand the number of expressions to
cope with new language features. I note that there is currently much
work on Haskell to remove this 'closed world' limitation from that
language, so I assume I am not alone in finding 'closed worlds' a
problem.

Cheers,

Howard.

Jon Harrop

unread,
Nov 12, 2007, 6:18:58 AM 11/12/07
to jvm-la...@googlegroups.com
On Monday 12 November 2007 06:47, hlovatt wrote:
> @Jon,
>
> I would like to see more type inference added to Java, something along
> the lines that Scala or Fortress does. I am not sure about the
> extensive inference common in functional languages; I find the error
> messages poor.

The error messages in functional languages vary wildly from implementation to
implementation. In fact, the languages themselves largely dictate how bad the
error messages can get: the more expressive the type system the more easily
it can be misled. Consequently, it is common to add type declarations when
writing OO code in OCaml, for example.

Scala's error messages are good but its inference capabilities are poor: worse
than SML, OCaml, F# and Haskell.

> In fact I have suggested extensions including more inference already:
>
> http://www.artima.com/weblogs/viewpost.jsp?thread=182412

My preference would be to simply forget about Java completely. I'd rather see
a modern functional programming language on the JVM with none of that
baggage. F# has done a good job of providing interoperability with
existing .NET libraries via OOP and active patterns.

Is the motivation of tweaking Java simply that existing Java programmers might
find it easier to learn a new language that looks similar?

> Changing subject: an example of the expression problem that is
> pertinant to discussing computer languages is if you write a parser
> for a language and then want to expand the language at a latter date.
> In this case you might need to expand the number of expressions to
> cope with new language features. I note that there is currently much
> work on Haskell to remove this 'closed world' limitation from that
> language, so I assume I am not alone if finding 'closed worlds' a
> problem.

You might like to look at how camlp4 solved that problem for OCaml by
providing an extensible grammar so that users can add syntax extensions to
OCaml. However, I'm not sure that has any practical applications beyond
writing macro systems for languages with non-trivial syntax (which is why
camlp4 does it).

John Cowan

unread,
Nov 12, 2007, 8:10:46 AM 11/12/07
to jvm-la...@googlegroups.com
On Nov 12, 2007 6:18 AM, Jon Harrop <j...@ffconsultancy.com> wrote:

> Is the motivation of tweaking Java simply that existing Java programmers might
> find it easier to learn a new language that looks similar?

It's no accident that Java itself looks much like C++, though
its semantics are far more like Smalltalk with static typing
bolted on.

Syntax is as important to programming languages as
appearance, dressing, and grooming are to people; that is,
it determines about 99% of accept/reject decisions.
The aggregate of those decisions determines whether
a language dies, remains a niche language forever
(or until its users die), or grows. Your choice.

Currently, the MLs and the Lisps seem to be niche
languages without a niche; you can use them for anything,
but almost no one wants to.

--
GMail doesn't have rotating .sigs, but you can see mine at
http://www.ccil.org/~cowan/signatures

David MacIver

unread,
Nov 12, 2007, 9:20:51 AM 11/12/07
to jvm-la...@googlegroups.com
On Nov 11, 2007 7:11 PM, Jon Harrop <j...@ffconsultancy.com> wrote:
> > > http://ocamlnews.blogspot.com/2007/11/most-popular-functional-languages-on.html
> >
> > That's using a rather specific metric. I rarely use apt for language
> > runtimes/compilers I actually care about because it's usually a few
> > versions behind.
>
> There would have to be thousands of people installing MLton from source but
> not OCaml from source for that to be significant. I doubt that is the case.
> In fact, I suspect OCaml's lead is larger than those results indicate because
> most SML programmers will have more than one of SMLNJ, MLton and MosML
> installed.

Ok. I concede the point. I don't know enough about the ML community to
comment further.

> > > > `*` (Expr<'a> x, Mult<'a> u v) = (x * u) * v;
> > >
> > > This is relying upon the ability to destructure one level only, right?
> >
> > There's in principle no reason why you can't nest this arbitrarily.
> > It's just that the implementation gets more complicated. This code
> > didn't seem to need it.
>
> I think if you do generalize that then you literally have pattern matching.

To an extent that's true, yes. But it supports a richer set of
relationships between the patterns than plain pattern matching does
because:

a) Many patterns are types in their own right
b) Patterns can have inheritance relationships among them that aren't
possible in OCaml.

But a certain amount of what goes on in there doesn't need any pattern
matching as often you're dispatching on patterns which can't fail.

> Ok. That is one of the design flaws in Scala that cost it enormously in
> productivity, IMHO.

That seems an odd opinion, given that in Scala you can get
exhaustiveness checking by declaring a class to be sealed. Similarly,
there's no reason you couldn't do the same in Nice or another multiple
dispatch language, it's just that Nice doesn't support that.

> > Now suppose we wanted to define (for example. I can't actually think
> > of a use for it :-) ):
> >
> > (Sequence<'a> 'b, Sequence<'a> 'c) splits ('a -> bool f, 'b x, 'c y) =
> > (split(f, x), split(f, y))
>
> OCaml programmers have been exposed to a solution to that problem for several
> years but it never gained traction. Specifically, Jacques Garrigue published
> a library that does exactly this using OCaml's OOP to perform the run-time
> dispatch for functions over containers (exactly what you are talking about).
>
> One of the main advantages is that you can now invoke members like "fold"
> without an implicit type. So the old code:
>
> let list_of_array a =
>   Array.fold_right (fun h t -> h::t) a []
>
> may be written in a generalized form:
>
> let list_of a =
>   a#fold_right (fun h t -> h::t) []
>
> This alternative was not adopted for three reasons. Firstly, this incurs a
> performance overhead that can be very significant (e.g. for empty arrays in
> the above case the time taken is dominated by the

The fact that the performance overhead is high compared to operations
which are going to be blindingly fast anyway doesn't really strike me
as a major problem. It's certainly not a reason to ditch a generally
useful feature.

Similarly, the fact that the performance overhead is high when you've
got no special compiler or runtime support for it strikes me as
unsurprising.

Additionally, this looks like single dispatch, not multiple dispatch,
unless I'm totally misunderstanding OCaml's object system (which is
very possible. I've never used it). If you didn't have a fold_right
method would you be able to add one? (In particular, one which
dispatches on the type). And how would you handle dispatching on the
type of more than one argument? Suppose I wanted to define

(Sequence<'a> 'b, Sequence<'a> 'c) 'b concat('b x, 'c y)

Which uses fold to create a new 'b in the generic case.

Suppose now I wanted to specialise to the case of two arrays

Array<'a> concat(Array<'a> x, Array<'a> y);

In order to take advantage of faster array creation and copying routines.

Would you be able to do this?

> dynamic dispatch to the
> fold_right member). Secondly, in the vast majority of practically important
> cases you can obtain the same genericity by simply passing the relevant
> function to a higher-order function:
>
> let list_of fold_right a =
>   fold_right (fun h t -> h::t) a []

Hurray for manually constructing and passing around dispatch tables?
Additionally, for more complicated things like numeric type classes
you're going to have to start passing around records of functions in
order to make this work, so it's going to get even worse.

> In more complicated cases, you can parameterize over the container type using
> functors rather than higher-order functions. Finally, in the

Functors are indeed a nice approach in certain cases. I suspect it
gets a bit clumsy when you're trying to combine multiple sequence
types and have to start making use of local imports, etc. Can you show
me how you'd make the split2 example work with functors?

Even beside that, you're now having to use other language features to
emulate what multiple dispatch manages with exactly the same mechanism
as it does pattern matching. Still think it's an instance of basic
pattern matching?

> vast majority of
> cases you know exactly what data structure you're using anyway, so the
> ability to treat something as an arbitrary sequence is basically useless.

This simply isn't true. You're looking at the world through OCaml
coloured glasses. Here, allow me to fix your statement:

"In languages which don't give you a decent facility to abstract over
the data structure being used, you almost always know the exact data
structure you're using anyway".

The sequences example isn't a particularly good one, but take a look
at the wealth of type classes in Haskell. Pretty much every time one
uses a type class they're dealing with a counterexample to that claim.
(Although, conversely, some of these instances would better be handled
with functors). For example, numerics. Having a decent abstraction
over numeric types can be very handy.

Similarly interfaces in Java/C#/similar, traits in Scala, etc.
Regardless of whether you choose to use it, there's a huge wealth of
activity where you want to write generic code.

> The disadvantages of the sequence approach are quite severe: the genericity
> weakens the type system resulting in obfuscated error messages and less
> reliable code. So the approach essentially undermines static typing, which is
> not surprising because you are effectively reinventing dynamic typing.

I've heard this argument before. It's still nonsense. Runtime type
information is not the same as dynamic typing - the compiler is still
able to make exactly the same guarantees that it was able to before
hand. It's just that there's more going on than there was previously.
Additional dynamic information + equally strong proofs that the
program doesn't go wrong does not dynamic typing make.

> So, again, you can solve this problem in OCaml but it is generally not a
> problem worth solving.

*ticks off another box on his language fallacy bingo card*

> > In your version we would need to provide potentially 9 different
> > versions of this! One for each combination of 'finger tree', 'array'
> > and 'anything else'.
>
> Perhaps you didn't mean to choose a trivially decomposable problem but the
> solution in that case is simply:
>
> let split2 f a b = Seq.split f a, Seq.split f b

Which doesn't specialise for arrays, or for finger trees, or for any
other cases you want to add, which is the point I was making.

> > > Are you asking for run-time dispatch to type specialized functions?
> >
> > Yes (Although it is often resolvable at compile time. e.g. if I apply
> > split to something known to be an array at compile time it shouldn't
> > cause a runtime check on the type). That's exactly what multiple
> > dispatch *does*.
>
> Right. The OCaml equivalent will never be resolved statically but it is
> trivial to implement.

It's often trivial to implement worse solutions. :-) Want me to show
you how to implement this badly with Java and the C preprocessor?
(Please don't say yes, I really don't want to have to remember how cpp
works)

> > > > What do you mean by 'depth' here? Are we talking about or patterns or
> > > > nesting constructors? If the former, I think you're right. If the
> > > > latter, see pseudocode.
> > >
> > > I was referring to A(1, B(2, C(3, D 4))) and so on as "deep patterns".
> >
> > Ok. But there's no reason those are noticeably harder (except at
> > the implementation level in order to make them efficient) than
> > destructuring a single level. If I can destructure x in f(x) to be (1,
> > y) then there's no reason I can't destructure y to be (2, z).
>
> I was careful when I chose my example. You are quite correct for a "singleton"
> pattern that necessarily involves no dispatch such as (1, (2, z)) but the
> pattern I gave might require run-time dispatch. If you handle that then you
> are just reinventing pattern matching.

Hm. That's true. Although note (I may have mentioned this before, I
can't remember) that it's a reinvention which translates pretty well,
as you can reimplement the pattern match with nested multimethods.

> As an aside, I don't like the implication that pattern matching over product
> types (what you call "deconstruction") should be treated differently to
> pattern matching over sum types (what you call "multiple dispatch"): this is
> the essence of ML-style pattern matching that makes it so powerful.

The combination of the two is definitely valuable, but they're clearly
different. In particular, destructuring is trivial to implement. :-)
(Nesting other patterns inside it on the other hand isn't).

> > It's incredibly easy to implement - what's difficult is inferring the
> > necessary patterns from each destructuring. In particular I'm trying
> > to point out that destructuring + multiple dispatch is equivalently
> > powerful to (basic, i.e. sans or patterns) pattern matching.
>
> I would say that destructuring + multiple dispatch is a basic form of pattern
> matching (without or-patterns, guards and most of the other features added
> since SML).

It isn't that. It merely contains that as a subset. It supports the
following additional features:

a) Extensibility - both open sum types (which as you've demonstrated
OCaml also supports) and extensibility of individual constructor types
(which as you've demonstrated OCaml doesn't support)
b) Generic functions which can perform a more elaborate dispatch on types.

> > Yes. In the pseudo-code it becomes more competitive (although still
> > longer). (One could, incidentally, cut out a lot of the mess by
> > defining an overload that `*` (Expr<T> x, Int<T> i) = i * x, but that
> > changes the semantic meaning in general even though it's ok here, so I
> > wanted to avoid it).
>
> Can you elaborate on that?

Multiplication is symmetric, so x * y = y * x. Thus we can avoid the
second lot of pattern matching on the Int constructor's arguments by a
single match which swaps it round to reduce it to the first case.
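As a concrete illustration of that swap trick, a hypothetical OCaml sketch where one commutativity case halves the `Int` handling (constant folding comes first so the swap cannot loop):

```ocaml
(* Hypothetical expression type for the sketch. *)
type expr = Int of int | Var of string | Mul of expr * expr

let rec simplify = function
  | Mul (Int a, Int b) -> Int (a * b)                (* fold constants first *)
  | Mul (Int 0, _) | Mul (_, Int 0) -> Int 0
  | Mul (Int 1, e) | Mul (e, Int 1) -> simplify e
  | Mul (e, (Int _ as i)) -> simplify (Mul (i, e))   (* swap: x * i = i * x *)
  | Mul (a, b) -> Mul (simplify a, simplify b)
  | e -> e

let () = assert (simplify (Mul (Var "x", Int 1)) = Var "x")
```

The swap case reduces `x * i` to the already-handled `i * x` shape, so the `Int`-on-the-right rules never need to be written out a second time.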

> > Haskell's core type system isn't undecidable as far as I know. Type
> > inference is entirely possible and no less straightforward than in ML
> > with Haskell's core Hindley Milner + type classes type system. It's
> > more a case of readability and explicit public contracts for the
> > module. Haskell's type system + a bazillion add-ons found only in GHC
> > head is of course. ;-)
>
> Haskell supports both polymorphic recursion and arbitrary-rank types (for
> generic monads to work) and both make the type system undecidable without
> manual type annotations.

As clarified in the follow up email, you're misspeaking - the type
system is decidable. The inference problem for it isn't (and not just
in a annoying halting problem sort of way. There are genuinely terms
for which it doesn't make sense to infer a type).

While I'm on this point, I'll respond here:

(from your other email)


> If that were the trade-off I would agree. The problem is that, when you make a
> mistake, Haskell's more expressive type system allows it to dream up all
> sorts of crazily complicated types to satisfy the incorrect criteria implied
> by your buggy code. So you end up with totally incomprehensible type errors
> pointing at completely the wrong place. This is solved by adding type
> annotations everywhere which bloats all of the code but keeps Haskell usable.

This is admittedly true. But the quality of the type error messages
can be improved, and the extra expressivity of those crazily
complicated types (95% of which aren't too crazily complicated at all
unless your name is Oleg) allows you to encode a much richer set of
invariants in your types, reducing the bugs your buggy code produces
when it actually makes it past the demon guardian of the compiler. And
we prefer compiler errors to runtime errors, don't we? ;-) Plus if you
want to adopt a more ML-like style of coding you can simply not use
the features that make the error messages so crazy.

> MLs take the decidable approach and even needlessly prohibit inferred type
> recursion. Consequently, ML code doesn't need superfluous type annotations to
> keep errors under control as Haskell does.

That seems like a losing battle to me. Fully inferable type systems
only go so far.
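For what it's worth, OCaml does accept polymorphic recursion, but only under an explicit polymorphic annotation; inference alone rejects it. A minimal sketch with a hypothetical `nested` type:

```ocaml
(* A non-regular ("nested") type: each Nest wraps a list-of level. *)
type 'a nested = Flat of 'a | Nest of 'a list nested

(* Without the explicit 'a. quantifier, this definition does not typecheck:
   the recursive call uses depth at a different instance of 'a. *)
let rec depth : 'a. 'a nested -> int = function
  | Flat _ -> 0
  | Nest n -> 1 + depth n

let () = assert (depth (Nest (Nest (Flat [ [ 1 ] ]))) = 2)
```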

> > Interesting. I note that the type parameter on expr seems to have
> > become a String. Is this a necessary change? (Note that mine works
> > with any type parameter T because the toString method can accept
> > different types generically).
>
> Sure, I missed that. Its a trivial fix though:
>
> type ('a, 'b) expr =
> ...
> | `Var of 'b ]
>
> etc.
>
> > It also looks rather like you're manually compiling type hierarchies
> > into pattern matching. ;-)
>
> Exactly, yes. The only difference is that you're facing an asymptotic code
> explosion and I'm not.

Indeed, you're not here. It translates better than I expected. On the
other hand, I'm not here either... Merely in other places. And you've
yet to provide a satisfactory account of how to handle other places
(other than "Oh, you shouldn't do that anyway!")

> > Ok. So it isn't extensible at all. It merely allows you to build
> > similar things on top of it,
>
> Extensible means "can expand or add to its capabilities".

Which you can't do. You can use its capabilities to build other ones.

> > and if I were to write a function which
> > accepts an expr it wouldn't be able to accept an expr2, even if the
> > function definition had a perfectly valid catch all term, right?
>
> Wrong. If the function had a "perfectly valid catch all term" then it would
> accept any supertype.
>
> The expr type is a subtype of expr2 so you can pass an expr to a function
> expecting an expr2 and it will work. If you write a generic function like
> this:
>
> # let is_int = function
> | `Int _ -> true
> | _ -> false;;
> val is_int : [> `Int of 'a ] -> bool = <fun>
>
> The argument type of this "is_int" function is a subtype of both expr and
> expr2, so you can pass it values of either type.

Ah, I see. That's quite nice.

> Note how OCaml's polymorphic variants allow the subtyping relationships to be
> inferred: there is no need to manually write out lots of new class
> hierarchies.
>
> > So this seems to do significantly less than the multiple dispatch
> > version in terms of allowing for extensibility.
>
> Can you give a specific example?

No, not really. I withdraw the comment now that you've pointed out how
the subtyping works here, although the need to redefine rather than
provide an extension to the string_of method still grates.
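To make the inferred-subtyping point concrete, a small runnable sketch (hypothetical `expr`/`expr2` definitions along the lines discussed):

```ocaml
(* Open/extensible sums via polymorphic variants: expr2 extends expr. *)
type expr = [ `Int of int | `Add of expr * expr ]
type expr2 = [ expr | `Mul of expr2 * expr2 ]

(* Inferred type: [> `Int of 'a ] -> bool, a subtype of both expr and expr2. *)
let is_int = function `Int _ -> true | _ -> false

let () = assert (is_int (`Int 3))
let () = assert (not (is_int (`Mul (`Int 1, `Int 2) : expr2)))

(* An expr value can be passed where an expr2 is expected via a coercion. *)
let accepts_expr2 (e : expr2) = is_int e
let () = assert (not (accepts_expr2 ((`Add (`Int 1, `Int 2) : expr) :> expr2)))
```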

> > > Secondly, you have polymorphic variants in OCaml (not SML, Haskell or F#)
> > > that provide overlapping sum types, inferred sum types and open sum
> > > types. These provide the same extensibility as the OOP approach.
> >
> > No they don't. They provide a different set of extensibilities - some
> > more, some less. In particular they provide no support for
> > encapsulation of type (see something like my Sequence example above)
>
> That is supported (see above).

It's supported in the object system, which is kept totally distinct
from pattern matching isn't it?

So, to recap, you've had to break out the following features:

a) Pattern matching (with open sum types)
b) Functors
c) An entire object system

In order to do things which multiple dispatch handles in a uniform way...

> > or for one constructor inheriting from another...
>
> There are no constructors, only functions.

Sure look like constructors to me. You can't pattern match on the
results of an arbitrary function.

> > You've not pointed out any problems with static checking in Nice.
> > Indeed there's a soundness proof for ML-sub. And we wouldn't want to
> > use a language without a soundness proof for its type system, would
> > we? ;-)
>
> You said it cannot support exhaustiveness checking because it cannot express
> closed sum types? That means it cannot handle the single most important

It can support exhaustiveness checking in various ways. It certainly
statically guarantees the exhaustiveness. Abstract methods can be
defined inside a module and any other module creating subtypes has to
satisfy the exhaustiveness criteria for those methods (or declare the
type abstract). See for example the 'symbol' method - it's abstract,
so every subtype of Expr<T> has to provide an implementation. Similar
things can be done with abstract interfaces (the type class analogue).

I'm not sure on the exact constraints on methods outside the module.
You definitely can't exhaustively enumerate the subtypes of a class,
but that's not an inherent property of the concept. Scala's sealed
classes would port directly to Nice.

There seem to be more general ways of supporting exhaustiveness even
without closed sum types, but they require whole program compilation.
I don't really know much about them though.

> > language shootout, and its dispatch mechanism could still use a fair
> > bit of optimising as I understand it (which, I suspect, it will never
> > receive).
>
> Java is generally within a factor of two of OCaml.

Sure. But

a) Within a factor of 4 doesn't seem too bad to me (looking at the
shootout metric it's often less than 2, with the average seeming to be
more like 3). Most of the languages that people are drooling over
these days don't even manage a factor of 10 (take a look at the Ruby
vs OCaml benchmarks for a chuckle).

b) The JVM is really not very well optimised for multiple dispatch,
and the Nice compiler itself could be probably be significantly better
optimised. I'd be surprised if improving these factors couldn't push
Nice to being faster than Java. I doubt it will be faster than OCaml,
but the potential for improved performance is still there. So your
claim that multiple dispatch will cripple performance is clearly not
supported.

John Cowan

unread,
Nov 12, 2007, 10:58:51โ€ฏAM11/12/07
to jvm-la...@googlegroups.com
On Nov 12, 2007 9:20 AM, David MacIver <david....@gmail.com> wrote:

> > There are no constructors, only functions.
>
> Sure look like constructors to me. You can't pattern match on the
> results of an arbitrary function.

I'm going to take this opportunity to mention the Q language
<http://q-lang.sourceforge.net>, OT because not on the JVM.
Q does equational reasoning directly, not as syntactic sugar
for the lambda calculus, and consequently lacks constructor
discipline (as in "bondage and ..."). It's dynamically typed,
eager, and capitalizes variable names.

Consequently, Q constructors are just terms for which there
are no rewrite rules, and an explicit declaration as a constructor
just prevents you from blundering and writing such a rule: it
will cause a compile-time error. Q also supports views, so
you can pattern-match on arbitrary functions of a given type,
though multiple views of the same type are not supported.

Here's my favorite Q micro-demo (though it doesn't involve
types at all): it rewrites arbitrary Boolean expressions with
variables into disjunctive normal form. It should be readable
to anyone familiar with an ML, and even more so to anyone
unfamiliar with one.

// eliminate double negations:
not not A = A;

// push negations inward (de Morgan's laws):
not (A or B) = not A and not B;
not (A and B) = not A or not B;

// distributivity laws:
A and (B or C) = A and B or A and C;
(A or B) and C = A and C or B and C;

// associativity laws:
A and (B and C) = (A and B) and C;
A or (B or C) = (A or B) or C;
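The same rules can be sketched in OCaml as an explicit normalizer — a hypothetical translation, since Q applies its equations automatically while here the rewriting is scheduled by hand (negation-normal form first, then distribution):

```ocaml
type bexp = Var of string | Not of bexp | And of bexp * bexp | Or of bexp * bexp

(* Push negations inward (de Morgan) and eliminate double negations. *)
let rec nnf = function
  | Not (Not a) -> nnf a
  | Not (And (a, b)) -> Or (nnf (Not a), nnf (Not b))
  | Not (Or (a, b)) -> And (nnf (Not a), nnf (Not b))
  | And (a, b) -> And (nnf a, nnf b)
  | Or (a, b) -> Or (nnf a, nnf b)
  | e -> e

(* Distribute And over Or until no Or remains under an And. *)
let rec dist = function
  | And (a, Or (b, c)) -> Or (dist (And (a, b)), dist (And (a, c)))
  | And (Or (a, b), c) -> Or (dist (And (a, c)), dist (And (b, c)))
  | And (a, b) ->
      let a' = dist a and b' = dist b in
      (match a', b' with
       | Or _, _ | _, Or _ -> dist (And (a', b'))
       | _ -> And (a', b'))
  | Or (a, b) -> Or (dist a, dist b)
  | e -> e

let dnf e = dist (nnf e)

let () =
  assert (dnf (And (Var "a", Or (Var "b", Var "c")))
          = Or (And (Var "a", Var "b"), And (Var "a", Var "c")))
```

The contrast with Q is that the ML version must decide the order in which the equations fire, where a rewrite system leaves that to the engine.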

> Most of the languages that people are drooling over
> these days don't even manage a factor of 10 (take a look at the Ruby
> vs OCaml benchmarks for a chuckle).

Well, CRuby is known to be significantly pessimized.

David MacIver

unread,
Nov 12, 2007, 11:18:00โ€ฏAM11/12/07
to jvm-la...@googlegroups.com

Yeah, languages based on rewrite rules are interesting. I keep meaning
to have a serious play (as opposed to the mere casual look I've given
them) with one of the various dialects of Obj. But OCaml is very
definitely not one of them. :-)

> > Most of the languages that people are drooling over
> > these days don't even manage a factor of 10 (take a look at the Ruby
> > vs OCaml benchmarks for a chuckle).
>
> Well, CRuby is known to be significantly pessimized.

Sure. I wasn't claiming that Ruby's performance was good. It was
merely a point of comparison. :-)

Jon Harrop

unread,
Nov 12, 2007, 11:57:59โ€ฏPM11/12/07
to jvm-la...@googlegroups.com
On Monday 12 November 2007 14:20, David MacIver wrote:
> > I think if you do generalize that then you literally have pattern
> > matching.
>
> To an extent that's true, yes. But it supports a richer set of
> relationships between the patterns than plain pattern matching does
> because:
>
> a) Many patterns are types in their own right
> b) Patterns can have inheritance relationships among them that aren't
> possible in OCaml.
>
> But a certain amount of what goes on in there doesn't need any pattern
> matching as often you're dispatching on patterns which can't fail.

I see. F# provides a pattern syntax for dynamic type tests:

| :? <baseclass> -> ...

that allows you to catch subtypes. This is used in the handling of
general .NET exceptions (that may form class hierarchies and have subtyping
relationships).

However, this has little practical use outside interoperability with
existing .NET code. That warrants its inclusion in F# but I'm not sure there
would be any advantage in including such a construct in a language like
OCaml.

> > Ok. That is one of the design flaws in Scala that cost it enormously in
> > productivity, IMHO.
>
> That seems an odd opinion, given that in Scala you can get
> exhaustiveness checking by declaring a class to be sealed.

Scala's exhaustiveness checking doesn't even work with booleans:

$ ./scala
Welcome to Scala version 2.6.0-final.
Type in expressions to have them evaluated.
Type :help for more information.

scala> def f(x: Boolean) = x match {
| case true => true
| }
f: (Boolean)Boolean

scala> f(false)
scala.MatchError: false
at .f(<console>:4)
at .<init>(<console>:5)
at .<clinit>(<console>)
at RequestResult$.<init>(<console>:3)
at RequestResult$.<clinit>(<console>)
at RequestResult$result(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodA...

> > let list_of a =
> >   a#fold_right (fun h t -> h::t) []
> >
> > This alternative was not adopted for three reasons. Firstly, this incurs
> > a performance overhead that can be very significant (e.g. for empty
> > arrays in the above case the time taken is dominated by the
>

> Additionally, this looks like single dispatch, not multiple dispatch,
> unless I'm totally misunderstanding OCaml's object system (which is
> very possible. I've never used it).

Yes.

> If you didn't have a fold_right method would you be able to add one?

Yes.

> (In particular, one which dispatches on the type).

There is no dispatch here. You could add dispatch, tag your values with
the "type" information you require and pattern match on that if you so wish.

> And how would you handle dispatching on the type of more than one argument?

Parallel pattern match over the "type" information of several values at once.

> Suppose I wanted to define
>
> (Sequence<'a> 'b, Sequence<'a> 'c) 'b concat('b x, 'c y)
>
> Which uses fold to create a new 'b in the generic case.
>
> Suppose now I wanted to specialise to the case of two arrays
>
> Array<'a> concat(Array<'a> x, Array<'a> y);
>
> In order to take advantage of faster array creation and copying routines.

You're sacrificing performance by imposing dispatch to enable an optimization.
Whether or not that will pay off is at best unclear.

> Would you be able to do this?

The literal translation is to tag your values with type information in OCaml:

let concat a b = match a, b with
  | `Array a, `Array b -> `Array (Array.append a b)
  | a, b -> a#fold a#cons b

In F# you can use the existing run-time type information:

let concat a b =
  match a, b with
  | (:? 'a array as a), (:? 'a array as b) -> Array.append a b
  | a, b -> Seq.append a b

You might also break down the specializations by partial application:

let concat a b = match a, b with
  | a, `Array b -> a#append_array b
  | a, b -> a#fold a#cons b
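A self-contained version of the tag-based dispatch above, with the object-style fallback replaced by a hypothetical `to_list` helper so the whole thing runs as-is:

```ocaml
(* A closed "tagged" sequence type: the tag is the type information
   that the match dispatches on. *)
type 'a seq = [ `Array of 'a array | `List of 'a list ]

let to_list : 'a seq -> 'a list = function
  | `Array a -> Array.to_list a
  | `List l -> l

let concat (a : 'a seq) (b : 'a seq) : 'a seq =
  match a, b with
  | `Array a, `Array b -> `Array (Array.append a b)  (* specialized fast path *)
  | a, b -> `List (to_list a @ to_list b)            (* generic fallback *)

let () = assert (concat (`Array [| 1 |]) (`Array [| 2; 3 |]) = `Array [| 1; 2; 3 |])
```

The parallel match on `a, b` is the "dispatch on the type of more than one argument" asked about above, expressed as an ordinary two-column pattern match.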

> > fold_right member). Secondly, in the vast majority of practically
> > important cases you can obtain the same genericity by simply passing the
> > relevant function to a higher-order function:
> >
> > let list_of fold_right a =
> > fold_right (fun h t -> h::t) a []
>
> Hurray for manually constructing and passing around dispatch tables?

There is no dispatch there.

> Additionally, for more complicated things like numeric type classes
> you're going to have to start passing around records of functions in
> order to make this work, so it's going to get even worse.

I already gave the dispatching example:

let list_of a =
  a#fold_right (fun h t -> h::t) []

As you can see this is hardly mind-boggling complexity and yet OCaml
programmers choose not to write in this style for the reasons I've explained.

> > In more complicated cases, you can parameterize over the container type
> > using functors rather than higher-order functions. Finally, in the
>
> Functors are indeed a nice approach in certain cases. I suspect it
> gets a bit clumsy when you're trying to combine multiple sequence
> types and have to start making use of local imports, etc. Can you show
> me how you'd make the split2 example work with functors?

I'll have to think about it. The Set.Make functor in the OCaml stdlib provides
an example of parameterizing a container over another container with a given
interface.
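In the same spirit as `Set.Make`, a minimal hypothetical functor that parameterizes `split` over the container's signature:

```ocaml
(* The interface a container must provide. *)
module type SEQ = sig
  type 'a t
  val filter : ('a -> bool) -> 'a t -> 'a t
end

(* split is written once, against SEQ; each instantiation is fully static. *)
module Split (S : SEQ) = struct
  let split f s = (S.filter f s, S.filter (fun x -> not (f x)) s)
end

module ListSplit = Split (struct
  type 'a t = 'a list
  let filter = List.filter
end)

let () = assert (ListSplit.split (fun x -> x > 1) [ 1; 2; 3 ] = ([ 2; 3 ], [ 1 ]))
```

A `split2` over two different containers would take two such modules as functor arguments; the dispatch is resolved entirely at compile time, which is the contrast with the multimethod version being discussed.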

> Even beside that, you're now having to use other language features to
> emulate what multiple dispatch manages with exactly the same mechanism
> as it does pattern matching.

This no longer has anything to do with multiple dispatch: the HOF and functor
approaches parameterise statically rather than dynamically.

> Still think it's an instance of basic pattern matching?

This no longer has anything to do with pattern matching either, for the same
reason.

> > cases you know exactly what data structure you're using anyway, so the
> > ability to treat something as an arbitrary sequence is basically useless.
>
> This simply isn't true. You're looking at the world through OCaml
> coloured glasses.

Note that this is also true in F# which provides native generic sequences and
many operations over them.

> Here, allow me to fix your statement:
>
> "In languages which don't give you a decent facility to abstract over
> the data structure being used, you almost always know the exact data
> structure you're using anyway".

Both of these languages provide ample opportunity to abstract over data
structure. Indeed, they represent the state of the art.

Look at OCaml's private row types, for example:

http://www.math.nagoya-u.ac.jp/~garrigue/papers/privaterows.pdf

> The sequences example isn't a particularly good one, but take a look
> at the wealth of type classes in Haskell. Pretty much every time one
> uses a type class they're dealing with a counterexample to that claim.

Unlike OCaml programmers, Haskell programmers have no alternative in this
respect (unless they move to OHaskell).

> (Although, conversely, some of these instances would better be handled
> with functors). For example, numerics. Having a decent abstraction
> over numeric types can be very handy.

Numerics are an excellent example: virtually no useful functions are generic
over number representation. Even summing a sequence of numbers reliably
requires completely different algorithms for ints and floats.

> Similarly interfaces in Java/C#/similar, traits in Scala, etc.
> Regardless of whether you choose to use it, there's a huge wealth of
> activity where you want to write generic code.

If you want to write generic code I would start with automatic generalization.
None of those languages support even that (including Scala).

> > The disadvantages of the sequence approach are quite severe: the
> > genericity weakens the type system resulting in obfuscated error messages
> > and less reliable code. So the approach essentially undermines static
> > typing, which is not surprising because you are effectively reinventing
> > dynamic typing.
>
> I've heard this argument before. It's still nonsense. Runtime type
> information is not the same as dynamic typing - the compiler is still
> able to make exactly the same guarantees that it was able to before
> hand. It's just that there's more going on than there was previously.
> Additional dynamic information + equally strong proofs that the
> program doesn't go wrong does not dynamic typing make.

I think we are talking at cross purposes here: I was referring to making code
generic over container type whereas you were referring to using multiple
dispatch rather than another technique to make code generic over container
type. You are quite right that things don't get any worse if you have already
made everything generic and extensible. My point is largely that making
everything generic and extensible is a bad idea because it severely
undermines reliability (you lose far too much static checking).

> > Perhaps you didn't mean to choose a trivially decomposable problem but
> > the solution in that case is simply:
> >
> > let split2 f a b = Seq.split f a, Seq.split f b
>
> Which doesn't specialise for arrays, or for finger trees, or for any
> other cases you want to add, which is the point I was making.

Seq.split already does all of the necessary specialization, which is the point
I was making. There is no reason to specialize your split2 function because
it is trivially decomposable.

> > Right. The OCaml equivalent will never be resolved statically but it is
> > trivial to implement.
>
> It's often trivial to implement worse solutions. :-)

How is it worse?

> > I was careful when I chose my example. You are quite correct for a
> > "singleton" pattern that necessarily involves no dispatch such as (1, (2,
> > z)) but the pattern I gave might require run-time dispatch. If you handle
> > that then you are just reinventing pattern matching.
>
> Hm. That's true. Although note (I may have mentioned this before, I
> can't remember) that it's a reinvention which translates pretty well,
> as you can reimplement the pattern match with nested multimethods.

Or you can get the compiler to do it for you...

> > > It's incredibly easy to implement - what's difficult is inferring the
> > > necessary patterns from each destructuring. In particular I'm trying
> > > to point out that destructuring + multiple dispatch is equivalently
> > > powerful to (basic, i.e. sans or patterns) pattern matching.
> >
> > I would say that destructuring + multiple dispatch is a basic form of
> > pattern matching (without or-patterns, guards and most of the other
> > features added since SML).
>
> It isn't that. It merely contains that as a subset.

Then it implements closed sum types and exhaustiveness and redundancy
checking?

> It supports the
> following additional features:
>
> a) Extensibility - both open sum types (which as you've demonstrated
> OCaml also supports) and extensibility of individual constructor types
> (which as you've demonstrated OCaml doesn't support)

What do you mean by "extensibility of individual constructor types"?

> b) Generic functions which can perform a more elaborate dispatch on types.

Your notion of dispatch on "type" is another special case of pattern matching
over a sum type.

> > > Yes. In the pseudo-code it becomes more competitive (although still
> > > longer). (One could, incidentally, cut out a lot of the mess by
> > > defining an overload that `*` (Expr<T> x, Int<T> i) = i * x, but that
> > > changes the semantic meaning in general even though it's ok here, so I
> > > wanted to avoid it).
> >
> > Can you elaborate on that?
>
> Multiplication is symmetric, so x * y = y * x. Thus we can avoid the
> second lot of pattern matching on the Int constructor's arguments by a
> single match which swaps it round to reduce it to the first case.

Ok. Those are called "views" in the context of pattern matching. F# integrates
them as "active patterns" and there is a library providing them for OCaml.
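In OCaml a view is just a function applied before matching; a hypothetical sketch of the symmetric-multiplication case discussed above:

```ocaml
type expr = Int of int | Var of string | Mul of expr * expr

(* A "view": normalize Mul so any Int operand comes first, letting one
   match arm cover both Int * x and x * Int. *)
let view = function
  | Mul (x, (Int _ as i)) -> Mul (i, x)
  | e -> e

let simplify e =
  match view e with
  | Mul (Int 0, _) -> Int 0
  | Mul (Int 1, x) -> x
  | e -> e

let () = assert (simplify (Mul (Var "x", Int 0)) = Int 0)
```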

> > If that were the trade-off I would agree. The problem is that, when you
> > make a mistake, Haskell's more expressive type system allows it to dream
> > up all sorts of crazily complicated types to satisfy the incorrect
> > criteria implied by your buggy code. So you end up with totally
> > incomprehensible type errors pointing at completely the wrong place. This
> > is solved by adding type annotations everywhere which bloats all of the
> > code but keeps Haskell usable.
>
> This is admittedly true. But the quality of the type error messages
> can be improved, and the extra expressivity of those crazily
> complicated types (95% of which aren't too crazily complicated at all
> unless your name is Oleg) allows you to encode a much richer set of
> invariants in your types, reducing the bugs your buggy code produces
> when it actually makes it past the demon guardian of the compiler. And
> we prefer compiler errors to runtime errors, don't we? ;-) Plus if you
> want to adopt a more ML-like style of coding you can simply not use
> the features that make the error messages so crazy.

Then you will suffer obfuscated error messages, as I explained.

> > MLs take the decidable approach and even needlessly prohibit inferred
> > type recursion. Consequently, ML code doesn't need superfluous type
> > annotations to keep errors under control as Haskell does.
>
> That seems like a losing battle to me. Fully inferable type systems
> only go so far.

If that were true, OCaml would enable its -rectypes option by default and we
would all be using theorem provers to write our everyday code. In reality,
there is a trade-off between more expressive type systems and less
comprehensible error messages.

> > Exactly, yes. The only difference is that you're facing an asymptotic
> > code explosion and I'm not.
>
> Indeed, you're not here. It translates better than I expected. On the
> other hand, I'm not here either... Merely in other places. And you've
> yet to provide a satisfactory account of how to handle other places
> (other than "Oh, you shouldn't do that anyway!")

I think my account was perfectly satisfactory: you want to dispatch over
multiple "types" which corresponds to a parallel pattern match over a
polymorphic variant. The code is almost identical to the complete example
that I posted. The only difference is that you pattern match over two values
in parallel instead of one.
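A minimal sketch of such a parallel match, dispatching on a pair of hypothetical polymorphic variant values in a single pattern (the analogue of a binary multimethod):

```ocaml
(* Double dispatch as one match over a pair of values. *)
let combine a b =
  match a, b with
  | `Int m, `Int n -> `Int (m + n)
  | `Str s, `Str t -> `Str (s ^ t)
  | `Int n, `Str s | `Str s, `Int n -> `Str (s ^ string_of_int n)

let () = assert (combine (`Int 2) (`Int 3) = `Int 5)
```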

My point is simply that OCaml has been capable of this for many years (it is
not a new idea) and some of the compiler writers even wrote libraries to help
with this approach but nobody adopted the solution.

> > > Ok. So it isn't extensible at all. It merely allows you to build
> > > similar things on top of it,
> >
> > Extensible means "can expand or add to its capabilities".
>
> Which you can't do. You can use its capabilities to build other ones.

I even gave an example of expanding its capabilities and cited a paper proving
that the two approaches are equivalent.

> > # let is_int = function
> >
> > | `Int _ -> true
> > | _ -> false;;
> >
> > val is_int : [> `Int of 'a ] -> bool = <fun>
> >
> > The argument type of this "is_int" function is a subtype of both expr and
> > expr2, so you can pass it values of either type.
>
> Ah, I see. That's quite nice.

OCaml's structurally subtyped OO system is a nice complement to its very
statically typed ML side. However, OCaml's OO is rarely used for the reasons
I've cited.

> > Note how OCaml's polymorphic variants allow the subtyping relationships
> > to be inferred: there is no need to manually write out lots of new class
> > hierarchies.
> >
> > > So this seems to do significantly less than the multiple dispatch
> > > version in terms of allowing for extensibility.
> >
> > Can you give a specific example?
>
> No, not really. I withdraw the comment now that you've pointed out how
> the subtyping works here, although the need to redefine rather than
> provide an extension to the string_of method still grates.

Yes.

> > > No they don't. They provide a different set of extensibilities - some
> > > more, some less. In particular they provide no support for
> > > encapsulation of type (see something like my Sequence example above)
> >
> > That is supported (see above).
>
> It's supported in the object system, which is kept totally distinct
> from pattern matching isn't it?
>
> So, to recap, you've had to break out the following features:
>
> a) Pattern matching (with open sum types)

This should be:

a) Polymorphic variants

> b) Functors
> c) An entire object system

You only need to choose one of those approaches.

> In order to do things which multiple dispatch handles in a uniform way...

Only (a) is equivalent to multiple dispatch. The others offer:

b) static resolution and checking.
c) structural subtyping.

> > > or for one constructor inheriting from another...
> >
> > There are no constructors, only functions.
>
> Sure look like constructors to me.

Naive constructors are replaced by literals and non-trivial constructors by
functions that use literals internally:

class Vec2 {
  double x;
  double y;
public:
  Vec2(double a, double b) : x(a), y(b) {}
};

Vec2 r(3.0, 4.0);

becomes:

type vec2 = { x: float; y: float }
let r = { x = 3.0; y = 4.0 }
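A non-trivial constructor, i.e. one with logic in its body, then becomes an ordinary function that builds the record literal internally; a hypothetical sketch:

```ocaml
type vec2 = { x: float; y: float }

(* A non-trivial "constructor": validates and normalizes its arguments
   before building the literal. *)
let unit_vec2 x y =
  let len = sqrt ((x *. x) +. (y *. y)) in
  if len = 0.0 then invalid_arg "unit_vec2: zero vector"
  else { x = x /. len; y = y /. len }

let () = assert ((unit_vec2 3.0 0.0).x = 1.0)
```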

> You can't pattern match on the results of an arbitrary function.

How do you mean?

> > > You've not pointed out any problems with static checking in Nice.
> > > Indeed there's a soundness proof for ML-sub. And we wouldn't want to
> > > use a language without a soundness proof for its type system, would
> > > we? ;-)
> >
> > You said it cannot support exhaustiveness checking because it cannot
> > express closed sum types? That means it cannot handle the single most
> > important
>
> It can support exhaustiveness checking in various ways. It certainly
> statically guarantees the exhaustiveness. Abstract methods can be
> defined inside a module and any other module creating subtypes has to
> satisfy the exhaustiveness criteria for those methods (or declare the
> type abstract). See for example the 'symbol' method - it's abstract,
> so every subtype of Expr<T> has to provide an implementation. Similar
> things can be done with abstract interfaces (the type class analogue).
>
> I'm not sure on the exact constraints on methods outside the module.
> You definitely can't exhaustively enumerate the subtypes of a class,
> but that's not an inherent property of the concept. Scala's sealed
> classes would port directly to Nice.

But Scala's sealed classes only cover a subset of ML's functionality, e.g.
they do not even handle booleans correctly, let alone products of them and so
forth.

> There seem to be more general ways of supporting exhaustiveness even
> without closed sum types, but they require whole program compilation.
> I don't really know much about them though.

Yes: whole program compilation lets the compiler close your sum types. That is
the last resort of dynamically typed languages for the same reason.

hlovatt

unread,
Nov 13, 2007, 2:30:09 AM
to JVM Languages
@Jon,

You keep saying that you lose type checking with multiple dispatch -
that isn't true.

Howard.


David MacIver

unread,
Nov 13, 2007, 5:37:03 AM
to jvm-la...@googlegroups.com
Jon, I think we're past the point where this argument is doing any
good. I'm going to pack up now, and we'll just have to carry on
disagreeing.

Jon Harrop

unread,
Nov 13, 2007, 6:32:59 AM
to jvm-la...@googlegroups.com
On Tuesday 13 November 2007 07:30, hlovatt wrote:
> @Jon,
>
> You keep saying that you lose type checking with multiple dispatch -
> that isn't true.

If you make your code very generic, e.g. using multiple dispatch, then you
undermine static checking. That is not specific to multiple dispatch: using
polymorphic variants in the same generic style undermines static checking
just as much.

My advice is to avoid this kind of genericity.
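To make the trade-off concrete, a hypothetical sketch of how an overly generic match over open polymorphic variants defers errors to run time:

```ocaml
(* The catch-all gives this function the open type
   [> `Int of int ] -> string, so a misspelled constructor such as
   `Itn is silently accepted and routed to the default case instead
   of being rejected by the compiler. *)
let describe = function
  | `Int n -> "int: " ^ string_of_int n
  | _ -> "unknown"

let () = assert (describe (`Itn 3) = "unknown")
```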

This whole discussion just boils down to the recent fad of trying to solve the
expression problem in OO languages to test their expressiveness. While that
may be an interesting academic exercise, I believe it has no practical
importance.

Jon Harrop

unread,
Nov 13, 2007, 1:32:04 PM
to jvm-la...@googlegroups.com

Yes, I think that's fine. Still, lots of interesting ideas... :-)

hlovatt

unread,
Nov 13, 2007, 10:42:52 PM
to JVM Languages
@Jon,

You posted a True-False example and said that it was an example that
open-world languages couldn't find errors in; but I posted the
equivalent PEC code and showed that it produced a similar error
message to ML's (including giving an example of what was wrong).
Therefore I would contend that there is at least one multiple-dispatch,
open-world language with just as robust error checking: PEC.

There are plenty of examples of when you want to increase the
expressions available:

1. A syntax tree for a programming language that is evolving - adding
new expressions
2. Adding new numeric types, e.g. if you have int and real and want to
add complex or matrix
3. Adding new list types that might have different performance/memory/
concurrency trade-offs
4. It is also useful to provide partial implementations and leave the
user of the API to complete the class - abstract classes
5. etc.
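Point 1 is also where OCaml's polymorphic variants (discussed earlier in this thread) do well; a hypothetical sketch of adding a new expression form without modifying the original evaluator, using the standard open-recursion encoding:

```ocaml
(* Base evaluator written with open recursion: it calls [self] rather
   than itself, so an extension can reuse it on the old cases. *)
let eval_base self = function
  | `Num n -> n
  | `Add (a, b) -> self a + self b

(* Extension: a new `Neg form, added without touching eval_base. *)
let eval_neg self = function
  | `Neg e -> - (self e)
  | (`Num _ | `Add _) as e -> eval_base self e

(* Tie the recursive knot. *)
let rec eval e = eval_neg eval e

let () = assert (eval (`Neg (`Add (`Num 1, `Num 2))) = -3)
```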

By the way: I am not claiming that Java is perfect (indeed, I have
written an expanded version, PEC), or that functional programming
doesn't have its good points; PEC includes immutable types, for
example. I would also like to be able to add/remove methods on an
already compiled class, which is one of the nice features of dynamic
languages like Ruby. But I think functional languages like ML and
Haskell have at least as many problems as current OO languages, and I
would probably go with Haskell and Clean as an improvement over ML,
particularly in the way they handle mutable data. To me, Scala is a
step in the right direction, as are Fortress and, I hope, PEC. I also
hope other languages will follow.

Howard.
