[Caml-list] Comparing two things of any two types, in pure OCaml

ol...@pobox.com

unread,

Sep 9, 2006, 2:41:33 AM9/9/06

to caml...@inria.fr

This message shows an example of an open, extensible comparison function
'a -> 'b -> bool.

Diego Olivier Fernandez Pons wrote about a help system associating (by
means of a hashtable) a documentation string to a function. Alas, the
hash function is not perfect and collisions are possible. Therefore,
we need to validate each hashtable hit by physically (==) comparing
the function (whose documentation we need to lookup) against the
target function. The functions to document have generally different
types. This raises two questions: how to store functions of different
types in the same data structure, and how to compare a candidate
function against these functions of different types. The physical
comparison function (==) cannot be used as it is: we need a comparison
function [cmp : 'a -> 'b -> bool] that can take arguments of any two
types, returning false if the argument types are different.

Jacques Garrigue suggested using Obj.repr. He also wrote ``Type
theoretician answer: What you would need to do that transparently
inside the type system is generic functions with dynamics.'' and
mentioned GADT.

It seems however that open polymorphic variants are ideal for the
task. We start with

let myeq (x : [>]) (y : [>]) = false;;

and add one clause, comparing two booleans

let myeq x y = match (x,y) with
(`T'bool x, `T'bool y) -> x = y
| _ -> myeq x y;;

We can add another clause, for int->bool functions:

let myeq x y = match (x,y) with
(`T'int2bool (x : int->bool), `T'int2bool y) -> x == y
| _ -> myeq x y;;

Let's generate some test data

let v1 = true
let v2 x = not x
let v3 f = f 2
let v4 x = x = 1;;

and we are ready for the first test:

let t1 = [
myeq (`T'bool v1) (`T'bool v1);
myeq (`T'int2bool v4) (`T'int2bool v4);
myeq (`T'bool v1) (`T'int2bool v4)
];;

which gives
val t1 : bool list = [true; true; false]

Let us extend myeq once again:

let myeq x y = match (x,y) with
(`T'int2bool'2bool (x : (int->bool)->bool),
`T'int2bool'2bool y) -> x == y
| _ -> myeq x y;;

We can now write the table associating with each function a
documentation string. We use an associative list of sorts:

let docs = [(myeq (`T'bool v1), "bool value");
(myeq (`T'int2bool v4), "v4");
(myeq (`T'int2bool'2bool v3), "v3");
];;

let lookup x docs =
let rec loop = function [] -> None
| ((c,d)::t) -> if c x then Some d else loop t
in loop docs
;;

Another test:

let t2 =
[lookup (`T'int2bool'2bool v3) docs;
lookup (`T'bool v1) docs;
lookup (`T'int2bool v4) docs
];;

which evaluates to
val t2 : string option list = [Some "v3"; Some "bool value"; Some "v4"]

We realize that we forgot about the function v2, which is of the type
bool->bool. So, we extend myeq once again

let myeq x y = match (x,y) with
(`T'bool2bool (x : bool->bool),
`T'bool2bool y) -> x == y
| _ -> myeq x y;;

and extend our documentation

let docs = (myeq (`T'bool2bool v2), "negation")
:: docs;;

let t3 =
[lookup (`T'int2bool'2bool v3) docs;
lookup (`T'bool2bool v2) docs;
lookup (`T'bool v1) docs;
lookup (`T'int2bool v4) docs
];;

which evaluates to
val t3 : string option list =
[Some "v3"; Some "negation"; Some "bool value"; Some "v4"]

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Jacques Garrigue

unread,

Sep 9, 2006, 3:16:21 AM9/9/06

to ol...@pobox.com

Hi Oleg,

> This message shows an example of an open, extensible comparison function
> 'a -> 'b -> bool.
>
> Diego Olivier Fernandez Pons wrote about a help system associating (by
> means of a hashtable) a documentation string to a function. Alas, the
> hash function is not perfect and collisions are possible. Therefore,
> we need to validate each hashtable hit by physically (==) comparing
> the function (whose documentation we need to lookup) against the
> target function. The functions to document have generally different
> types. This raises two questions: how to store functions of different
> types in the same data structure, and how to compare a candidate
> function against these functions of different types. The physical
> comparison function (==) cannot be used as it is: we need a comparison
> function [cmp : 'a -> 'b -> bool] that can take arguments of any two
> types, returning false if the argument types are different.
>
> Jacques Garrigue suggested using Obj.repr. He also wrote ``Type
> theoretician answer: What you would need to do that transparently
> inside the type system is generic functions with dynamics.'' and
> mentioned GADT.

Small comment: By transparently I meant "without any type
annotation". Then I gave a solution using normal datatypes for
annotations, and polymorphic methods, and just mentioned that GADTs
were not useful in this case. Note that my solution cannot directly
use polymorphic variants, as it uses non-regular types, and
polymorphic variants have to be regular. But an interesting question
is whether that solution could be made more modular, to enable
extension with new type constructors.

> It seems however that open polymorphic variants are ideal for the
> task. We start with
>
> let myeq (x : [>]) (y : [>]) = false;;

[...]

> We can add another clause, for int->bool functions:
> let myeq x y = match (x,y) with
> (`T'int2bool (x : int->bool), `T'int2bool y) -> x == y
> | _ -> myeq x y;;

[...]

> Let us extend myeq once again:
> let myeq x y = match (x,y) with
> (`T'int2bool'2bool (x : (int->bool)->bool),
> `T'int2bool'2bool y) -> x == y
> | _ -> myeq x y;;

While such a solution is appealing, there are drawbacks.
1) You have to add a new case for each new function type, rather than
each type constructor.
2) In the open case, there is no static protection against mistyping a
variant constructor. But you can check it dynamically:
let type_handled x = myeq x x
This will return true only if x's type is handled by myeq.
2) Polymorphic variant typing does not always play well with
modularity. For instance, you cannot define an hashtable in a
module, and add new cases no included in the original type in
another module. For this reason, exceptions may be better for
extension across modules.

This said, I love polymorphic variants anyway...

Jacques Garrigue

ol...@pobox.com

unread,

Sep 13, 2006, 5:20:31 AM9/13/06

to garr...@math.nagoya-u.ac.jp

Hello!

> > It seems however that open polymorphic variants are ideal for the
> > task. We start with
> >
> > let myeq (x : [>]) (y : [>]) = false;;
> [...]
> > We can add another clause, for int->bool functions:
> > let myeq x y = match (x,y) with
> > (`T'int2bool (x : int->bool), `T'int2bool y) -> x == y
> > | _ -> myeq x y;;
> [...]
> > Let us extend myeq once again:
> > let myeq x y = match (x,y) with
> > (`T'int2bool'2bool (x : (int->bool)->bool),
> > `T'int2bool'2bool y) -> x == y
> > | _ -> myeq x y;;

> While such a solution is appealing, there are drawbacks.
> 1) You have to add a new case for each new function type, rather than
> each type constructor.
> 2) In the open case, there is no static protection against mistyping a

> variant constructor. But you can check it dynamically...

> This will return true only if x's type is handled by myeq.
> 2) Polymorphic variant typing does not always play well with

> modularity. ... For this reason, exceptions may be better for
> extension across modules.

The drawback #1 is a feature! The intent was to play around the
similarity between such polymorphic variants and dependent sums. The
tag like `T'int2bool almost certainly has a run-time representation --
that's what makes (run-time) matching possible. Let's call the type of
these run-time values typeRep. Therefore, the value
`T'int2bool (x : int->bool)
might be thought of as
let tag = (repr `T'int2bool) : typerep in (tag, x : reflect(tag))
or, in a more familiar notation,
Sigma (tag: typeRep) x : reflect(tag)
where reflect is a function from terms to types. It should be a function:
identical tags correspond to identical types. Then the myeq function
might be thought of as
let myeq x y = open x with (tag1,xv : reflect(tag1)) in
open y with (tag2,yv : reflect(tag2)) in
tag1 = tag2 && (==) [reflect(tag1)] xv yv

where [t] is a type application. Then one may dream of the ways to
generate these tags a bit more systematically (actually, there is: the
compiler has the way to represent types, and in MetaOCaml, these types
can be reified to run-time values).

brogoff

unread,

Sep 14, 2006, 12:39:01 AM9/14/06

to caml...@inria.fr

On Sat, 9 Sep 2006, Jacques Garrigue wrote:
> > Jacques Garrigue suggested using Obj.repr. He also wrote ``Type
> > theoretician answer: What you would need to do that transparently
> > inside the type system is generic functions with dynamics.'' and
> > mentioned GADT.
>
> Small comment: By transparently I meant "without any type
> annotation". Then I gave a solution using normal datatypes for
> annotations, and polymorphic methods, and just mentioned that GADTs
> were not useful in this case. Note that my solution cannot directly

> use polymorphic variants, it uses non-regular types, and

> polymorphic variants have to be regular.

Is that to make type inference tractable? If so, any ideas on what
amount of annotation would be needed to get around this?

> This said, I love polymorphic variants anyway...

I like them a lot, and I often wish there was a dual notion of polymorphic
records in Caml as well. But now I wonder about the non-regular types issue
you bring up; I guess it means that you need plain old sum types too
if you wish to use them.

-- Brian

Diego Olivier Fernandez Pons

unread,

Sep 16, 2006, 2:56:38 PM9/16/06

to caml...@inria.fr

Bonjour,

What does Andreas Rossberg in his SML vs. Caml page

http://www.ps.uni-sb.de/~rossberg/SMLvsOcaml.html

mean when he says that

[Caml] Does not have a proper generic equality
on one hand (1, r) != (1, r), on the other (1, r) = (1, ref 1)

Is this the "problem" he is pointing ?

# let r = ref 1 in (1, r) = (1, r);;
- : bool = true

# let r = ref 1 in (1, r) != (1, r);;
- : bool = true

As far as I understand the only dark corners with structural equality are
border cases (Nan) where compare differs from (=).

[from Pervasives.ml]
external (=) : 'a -> 'a -> bool = "%equal"
external (<>) : 'a -> 'a -> bool = "%notequal"
external compare: 'a -> 'a -> int = "%compare"

external (==) : 'a -> 'a -> bool = "%eq"
external (!=) : 'a -> 'a -> bool = "%noteq"

Diego Olivier

Jacques Garrigue

unread,

Sep 16, 2006, 5:01:46 PM9/16/06

to diego-olivier....@cicrp.jussieu.fr

From: Diego Olivier Fernandez Pons <Diego.FERN...@etu.upmc.fr>

> What does Andreas Rossberg in his SML vs. Caml page
>
> http://www.ps.uni-sb.de/~rossberg/SMLvsOcaml.html
>
> mean when he says that
>
> [Caml] Does not have a proper generic equality
> on one hand (1, r) != (1, r), on the other (1, r) = (1, ref 1)

The word "proper" is a bit strange (SML-centric), but the meaning is
indeed that in ocaml you need two kinds of equalities, while in SML
you can do with one.
You have a taste of the SML way with objects:

# class c =
object val mutable x = 0 method set x' = x <- x' method get = x end;;
class c :
object val mutable x : int method get : int method set : int -> unit end
# new c = new c;;
- : bool = false

Namely, in SML only immutable values are compared structurally,
mutable ones being compared by pointer.
The main advantage is that there is no need for physical equality,
knowing that physical equality sometimes allows to distinguish between
things that should be identical. This is completed by the use of
so-called equality types, so that you cannot use equality on abstract
types, where it may break abstraction.

I personally like the idea of automatically using pointer equality on
mutables. However pointer equality sometimes comes as a handy
optimization for immutable values too, so going a long way to exclude
it from the language may seem a bit harsh.

By the way, I don't think that his statement is connected with the
incoherence between structural and physical equality on Nan, which is
a rather recent phenomenon in ocaml.

Jacques Garrigue

ross...@ps.uni-sb.de

unread,

Sep 17, 2006, 1:11:14 AM9/17/06

to caml...@inria.fr

Diego Olivier Fernandez-Pons wrote:
>
> What does Andreas Rossberg in his SML vs. Caml page
>
> http://www.ps.uni-sb.de/~rossberg/SMLvsOcaml.html
>
> mean when he says that
>
> [Caml] Does not have a proper generic equality
> on one hand (1, r) != (1, r), on the other (1, r) = (1, ref 1)

To clarify: "proper" refers to "generic".

The problem example is the one given on the left-hand side of the table.
Let me elaborate.

OCaml has two equalities: one fully structural, the other physical. Both
work fine in special cases, but neither has the "right" semantics in the
general case, where you deal with an arbitrary mixture of structural and
mutable types. By "right" I mean equality in the standard Leibniz sense,
where you can always replace equals with equals, without changing the
outcome. Neither of OCaml's operators has this property.

More conretely, given

let r = ref 1

then the pairs (1, r) and (1, ref 1) are distinguishable by the effect of
assignment. Still, OCaml's = considers them equal. On the other hand, it
is not observable (by other means) whether the pair (1, r) is constructed
once or twice, but == distinguishes these cases. Consequently, neither
operator implements equality "correctly" on type (int * int ref), because
either delivers a "wrong" result for one of these two examples.

This problem can sometimes make translation from SML (which has Leibniz
equality) to OCaml subtly difficult (that the inverse direction is
potentially difficult is more obvious).

Cheers,
- Andreas

skaller

unread,

Sep 17, 2006, 5:48:51 AM9/17/06

to ross...@ps.uni-sb.de

On Sun, 2006-09-17 at 07:08 +0200, ross...@ps.uni-sb.de wrote:

> This problem can sometimes make translation from SML (which has Leibniz
> equality)

It does? I'm curious how it handles

(a) functions
(b) abstract types

The former can't support other than physical equality
and the latter can only support value equality with a user
supplied routine.

--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

ross...@ps.uni-sb.de

unread,

Sep 17, 2006, 8:46:23 AM9/17/06

to skaller

skaller wrote:
> On Sun, 2006-09-17 at 07:08 +0200, ross...@ps.uni-sb.de wrote:
>
>> This problem can sometimes make translation from SML (which has Leibniz
>> equality)
>
> It does? I'm curious how it handles
>
> (a) functions
> (b) abstract types

By statically ruling out the use of generic equality on them. That's the
purpose of eqtypes in SML.

For the latter, the implementor of the abstract type can choose to allow
generic equality if the abstract equality coincides with representational
equality.

Of course, this solution comes with its own set of problems...

Btw, SML'97 even goes as far as disallowing the use of generic equality on
floats, because IEEE equality does not meet the requirements. You have to
use a special operator Real.== to compare them.

- Andreas