Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[Caml-list] Polymorphic recursion

2 views
Skip to first unread message

Loup Vaillant

unread,
Apr 3, 2007, 1:02:57 PM4/3/07
to caml...@yquem.inria.fr
I was reading Okasaki's book, "Purely functionnal data structures",
and discovered that ML (and Ocaml) doesn't support non uniforms
function definitions.

So, even if:

(**)
type 'a seq = Unit | Seq of ('a * ('a * 'a)seq);;
(**)

is correct,

(**)
let rec size = fun
| Unit -> 0
| Seq(_, b) -> 1 + 2 * size b;;
(**)

is not. Here is the error:
#
| Seq(_, b) -> 1 + 2 * size b;;
This expression (the last 'b') has type seq ('a * 'a) but is here used
with type seq 'a
#

Why?
Can't we design a type system which allow this "size" function?
Can't we implement non uniform recursive function (efficiently, or at all)?.

I suppose the problem is somewhat difficult, but I can't see where.

Regards,
Loup Vaillant

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Jeremy Yallop

unread,
Apr 3, 2007, 1:22:20 PM4/3/07
to caml...@yquem.inria.fr
Loup Vaillant wrote:
> I was reading Okasaki's book, "Purely functionnal data structures",
> and discovered that ML (and Ocaml) doesn't support non uniforms
> function definitions.
>
> So, even if:
>
> (**)
> type 'a seq = Unit | Seq of ('a * ('a * 'a)seq);;
> (**)
>
> is correct,
>
> (**)
> let rec size = fun
> | Unit -> 0
> | Seq(_, b) -> 1 + 2 * size b;;
> (**)
>
> is not.

Right. You can write polymorphic-recursive functions if you wrap them
in recursive modules, though:

module rec Size : sig val size : 'a seq -> int end = struct
let rec size = function
| Unit -> 0
| Seq (_, b) -> 1 + 2 * Size.size b
end

Hope this helps,

Jeremy.

Till Varoquaux

unread,
Apr 3, 2007, 1:36:39 PM4/3/07
to Loup Vaillant
There is indeed an issue with polymorphic recursion. It has been shown
that allowing polymorphic recursion in an ML like language would make
the type inference undecidable. One workaround would be for the
typechecker to use type annotations in some given cases. This is
however not the case in Ocaml: type information are either required or
only useful to debug/document some code/"restrict" the type of a given
variable.

Therefor, if you wish to use polymorphic recursion (yep you can) you
might want to use something where you have to specify the type; this
includes records, objects and recursive modules. So this

type 'a seq = Unit | Seq of ('a * ('a * 'a)seq)

type szRec={f:'a.'a seq -> int};;

let size=
let rec s =
{f=function
| Unit -> 0
| Seq(_, b) -> 1 + 2 * s.f b}
in
s.f

might be what you where yearning for.

Cheers,
Till

brogoff

unread,
Apr 3, 2007, 4:04:16 PM4/3/07
to caml...@yquem.inria.fr
On Tue, 3 Apr 2007, Till Varoquaux wrote:
> Therefor, if you wish to use polymorphic recursion (yep you can) you
> might want to use something where you have to specify the type; this
> includes records, objects and recursive modules. So this
>
> type 'a seq = Unit | Seq of ('a * ('a * 'a)seq)
>
> type szRec={f:'a.'a seq -> int};;
>
> let size=
> let rec s =
> {f=function
> | Unit -> 0
> | Seq(_, b) -> 1 + 2 * s.f b}
> in
> s.f
>
> might be what you where yearning for.

What I'd be yearning for would be more like

let rec size : 'a . 'a seq -> int =
fun s ->
match s with Unit -> 0 | Seq(_,b) -> 1 + 2 * (size b)

rather than having to use recursive modules or higher rank polymorphism
of record fields/polymorphic methods.

-- Brian

skaller

unread,
Apr 3, 2007, 9:30:24 PM4/3/07
to brogoff
On Tue, 2007-04-03 at 13:00 -0700, brogoff wrote:
> On Tue, 3 Apr 2007, Till Varoquaux wrote:
> > Therefor, if you wish to use polymorphic recursion (yep you can) you
> > might want to use something where you have to specify the type; this
> > includes records, objects and recursive modules. So this
> >
> > type 'a seq = Unit | Seq of ('a * ('a * 'a)seq)
> >
> > type szRec={f:'a.'a seq -> int};;
> >
> > let size=
> > let rec s =
> > {f=function
> > | Unit -> 0
> > | Seq(_, b) -> 1 + 2 * s.f b}
> > in
> > s.f
> >
> > might be what you where yearning for.
>
> What I'd be yearning for would be more like
>
> let rec size : 'a . 'a seq -> int =
> fun s ->
> match s with Unit -> 0 | Seq(_,b) -> 1 + 2 * (size b)
>
> rather than having to use recursive modules or higher rank polymorphism
> of record fields/polymorphic methods.

Why not:

let fun rec size : 'a . 'a seq -> int =


| Unit -> 0
| Seq(_,b) -> 1 + 2 * (size b)

with a nice camlp4 example for people by nicolas....@gmail.com ?


--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

skaller

unread,
Apr 3, 2007, 9:43:50 PM4/3/07
to brogoff
On Wed, 2007-04-04 at 11:27 +1000, skaller wrote:

> Why not:
>
> let fun rec size : 'a . 'a seq -> int =
> | Unit -> 0
> | Seq(_,b) -> 1 + 2 * (size b)
>
> with a nice camlp4 example for people by nicolas....@gmail.com ?

Opps .. should be

let rec fun ..
and fun ..
and ..
in
let fun ..
and ..

that is, the 'fun' should come after the 'rec' keyword if present.

Alain Frisch

unread,
Apr 4, 2007, 1:28:35 AM4/4/07
to Jeremy Yallop
Jeremy Yallop wrote:
> Right. You can write polymorphic-recursive functions if you wrap them
> in recursive modules, though:
>
> module rec Size : sig val size : 'a seq -> int end = struct
> let rec size = function
> | Unit -> 0
> | Seq (_, b) -> 1 + 2 * Size.size b
> end

Note that you don't even need the "rec" in this example, and that the
same idiom would support mutually recursive functions. To make an
automatic translation simpler, you can simply "open Size" at the
beginning of the structure to avoid rewriting self-references in the
function's body. To support local definitions, you can of course rely on
local modules:

let rec f : 'a 'b. t = E1 in E2

becomes:

let module X = struct
module rec Y : sig val f : t end = struct
open Y
let f = E1
end
open Y
let v = E2
end in
X.v

However, this encoding has an important drawback: you cannot use type
variables currently in scope in t, E1, E2 (as a consequence, we don't
need to explicitly quantify over variables in the function prototype,
the encoding forces all the variables in the function's type to be
universally quantified). By changing the encoding, you can allow
references to those type variables in E2:

let f =
let module X = struct
module rec Y : sig val f : t end = struct
open Y
let f = E1
end
end in
X.Y.f
in
E2


-- Alain

Loup Vaillant

unread,
Apr 4, 2007, 8:55:46 AM4/4/07
to caml...@yquem.inria.fr
Thanks, everybody.
Now, does anyone have an idea of the overheads? I can build syntactic
sugar for all three case (using my "not yet build" Lisp syntax), so
which is more efficient?

Classes?
Recursive modules?
Records?

Thanks,
Loup

2007/4/4, Alain Frisch <Alain....@inria.fr>:

Roland Zumkeller

unread,
Apr 4, 2007, 9:51:40 AM4/4/07
to Loup Vaillant
On 03/04/07, Loup Vaillant <loup.v...@gmail.com> wrote:
> Can't we implement non uniform recursive function (efficiently, or at all)?.

It has actually been done: call ocaml with the option "-rectypes" and
your example will work as is.

Best,

Roland

--
http://www.lix.polytechnique.fr/~zumkeller/

Alain Frisch

unread,
Apr 4, 2007, 11:18:38 AM4/4/07
to Roland Zumkeller
Roland Zumkeller wrote:
> On 03/04/07, Loup Vaillant <loup.v...@gmail.com> wrote:
>> Can't we implement non uniform recursive function (efficiently, or at
>> all)?.
>
> It has actually been done: call ocaml with the option "-rectypes" and
> your example will work as is.

No. The example would typecheck and the following type will be inferred:

val size : ('a * 'a as 'a) seq -> int = <fun>

but since there is no value of type ('a * 'a as 'a) (for some 'a), you
will be able to apply the function only to the value Unit:

# size Unit;;
- : int = 0
# size (Seq (1, Unit));;
This expression has type int * 'a seq but is here used with type
('b * 'b as 'b) * ('b * 'b) seq


This is a good illustration of why -rectypes is not enabled by default.


-- Alain

Alain Frisch

unread,
Apr 4, 2007, 11:24:49 AM4/4/07
to Alain Frisch
Alain Frisch wrote:
> but since there is no value of type ('a * 'a as 'a)

Sorry, there are actually values in this type, but they are all
structurally equal to the result of "let rec x = (x,x) in x".

Stefan Monnier

unread,
Apr 4, 2007, 11:52:17 AM4/4/07
to caml...@inria.fr
> This is a good illustration of why -rectypes is not enabled by default.

In my experience, polymorphic recursion is almost always linked to
datatypes whose recursion is itself polymorphic. As is the case in
Loup's example.

Has there been work in trying to leverage this connection, so that
polymorphic recursion can be automatically inferred by taking advantage of
the implicit type hint provided by the datatype definition?


Stefan

Roland Zumkeller

unread,
Apr 4, 2007, 12:54:26 PM4/4/07
to Alain Frisch
On 04/04/07, Alain Frisch <Alain....@inria.fr> wrote:
> Alain Frisch wrote:
> > but since there is no value of type ('a * 'a as 'a)
>
> Sorry, there are actually values in this type, but they are all
> structurally equal to the result of "let rec x = (x,x) in x".

Yes, but "('a * 'a as 'a) seq" has more than one value (modulo
structural equality), so it works to *some* extent:

# let rec x = (x,x);;
val x : 'a * 'a as 'a =
# size (Seq (x,Unit));;
- : int = 1
# size (Seq (x, Seq (x,Unit)));;
- : int = 3

However, you are right in pointing out that this limited use won't be
helpful for most applications, so I'd suggest:

# let rec size = function
| Unit -> 0
| Seq (_, b) -> 1 + 2 * size (Obj.magic b);;
val size : 'a seq -> int = <fun>

The function can be checked in richer type systems with annotations
(e.g. Coq's), so we know that Obj.magic is not dangerous here.

Roland

--
http://www.lix.polytechnique.fr/~zumkeller/

Alain Frisch

unread,
Apr 4, 2007, 4:00:36 PM4/4/07
to Roland Zumkeller
Roland Zumkeller wrote:
> The function can be checked in richer type systems with annotations
> (e.g. Coq's), so we know that Obj.magic is not dangerous here.

The fact that something is well-typed in Coq does not mean that you can
just translate it to OCaml by adding a few Obj.magic to make the
type-checker happy. OCaml programmers tend to have a rough mental
picture of what the semantic of the Obj module is and what are the
important properties of the runtime representation of values, but they
often only see part of the picture. Do you know what the following piece
of code does?

let () =
let x = if ("a" = "b") then Obj.magic 0 else String.copy "abc" in
for i = 0 to 100000 do ignore (ref [1]) done;
Gc.major ();
print_endline x

Well, if I knew Coq, I could prove that "a" is not equal to "b" and thus
that x is always bound to a valid string. So the Caml code should print
"abc". Right?

No. This code compiled with ocamlopt produces a segfault on my machine.
I remember spending hours (and wasting my boss' precious time) on a bug
I introduced in some code because I thought that Obj.magic 0 and
Obj.magic () are equivalent. The code above show that this is not the
case (if you replace 0 with (), it works fine).

If you don't understand what's going on, you'd better not use the Obj
module. If you know why, there is probably some other dark corner which
you don't understand and that will bite you some day.


In the present case, we have good solutions that don't require Obj.
Unless a strong case is made that performance is not adequate, there is
really no reason to use Obj.

-- Alain

brogoff

unread,
Apr 4, 2007, 4:14:07 PM4/4/07
to caml...@yquem.inria.fr
On Wed, 4 Apr 2007, Alain Frisch wrote:
> In the present case, we have good solutions that don't require Obj.
> Unless a strong case is made that performance is not adequate, there is
> really no reason to use Obj.

I agree with the "don't use Obj" part, but I think the current solutions
are not that good. Encoding with records or objects is unclear, and
while I find the encoding with recursive modules much clearer, that's
still listed as an experimental feature. It also seems a bit heavy,
using a recursive module just to get polymorphic recursion. Providing a
type for the function and having the compiler use it is the right
solution, IMO. It's not as much performance (though that is important!)
as clarity, admittedly subjective, that I am judging the solutions by.

-- Brian

Brian Hurt

unread,
Apr 4, 2007, 7:37:40 PM4/4/07
to Loup Vaillant

On Tue, 3 Apr 2007, Loup Vaillant wrote:

> I was reading Okasaki's book, "Purely functionnal data structures",
> and discovered that ML (and Ocaml) doesn't support non uniforms
> function definitions.
>
> So, even if:
>
> (**)
> type 'a seq = Unit | Seq of ('a * ('a * 'a)seq);;
> (**)

This way lies GADTs, which are a really neat idea, but even the
Haskeller's aren't 100% sure how to implement correctly yet.

In any case, there's a fairly simple work around which does work in the
current type system, which Okasaki describes IIRC. Basically, you just
do:

type 'a tuple = Tuple of 'a tuple * 'a tuple | Leaf of 'a;;

type 'a seq = Unit | Set of 'a tuple * 'a seq;;

which is a bit of a pain, but works.

Brian

Loup Vaillant

unread,
Apr 5, 2007, 4:18:58 AM4/5/07
to Brian Hurt
2007/4/5, Brian Hurt <bh...@spnz.org>:

>
>
> On Tue, 3 Apr 2007, Loup Vaillant wrote:
>
> > I was reading Okasaki's book, "Purely functionnal data structures",
> > and discovered that ML (and Ocaml) doesn't support non uniforms
> > function definitions.
> >
> > So, even if:
> >
> > (**)
> > type 'a seq = Unit | Seq of ('a * ('a * 'a)seq);;
> > (**)
>
> This way lies GADTs, which are a really neat idea, but even the
> Haskeller's aren't 100% sure how to implement correctly yet.
>
> In any case, there's a fairly simple work around which does work in the
> current type system, which Okasaki describes IIRC. Basically, you just
> do:
>
> type 'a tuple = Tuple of 'a tuple * 'a tuple | Leaf of 'a;;
>
> type 'a mono_seq = Unit | Set of 'a tuple * 'a seq;;

>
> which is a bit of a pain, but works.

(note: I have renamed "mono_seq" for disambiguation)

This workaround doesn't work exactly as intended: as Okasaki pointed
at, the binary tree "tuple" is not guaranteed to be balanced.

Therefore, even if we can build a trivial injection of type
(* seq -> mono_seq *), it will not be a surjection as well.

We have lost an invariant, and are forced to maintain it
programmatically. This kind of workaround is well known: we call it
abstract type. What I find cool with polymorphic types is that more
invariants can be checked directly by the type system.
It has two advantages:
-> It is less error prone when writing the module attached to the type.
-> Sometimes, a programmer outside the module can even do pattern
matching on this type and still be guaranteed she will not produce a
single wrong value.

For these reasons,I wanted a way to exploit polymorphic type. You all
provided three. I don't mind the syntactic burden, provided someone
come up a syntactic shortcut.

Loup

Roland Zumkeller

unread,
Apr 5, 2007, 5:34:13 AM4/5/07
to caml...@yquem.inria.fr
On 04/04/07, Alain Frisch <Alain....@inria.fr> wrote:
> The fact that something is well-typed in Coq does not mean that you can
> just translate it to OCaml by adding a few Obj.magic to make the
> type-checker happy.

As I understand Pierre Letouzey's PhD thesis explains how this can be
done. Your example couldn't result from translating a Coq term, since
"String.copy", "ref", and "Gc.major" are not part of its formalism (a
flavour of lambda calculus with inductive types).

> let () =
> let x = if ("a" = "b") then Obj.magic 0 else String.copy "abc" in
> for i = 0 to 100000 do ignore (ref [1]) done;
> Gc.major ();
> print_endline x

> This code compiled with ocamlopt produces a segfault on my machine.

On my machine it prints "$d$d$d$d$d$d$d$d$d$d$d$d$d$d$d" and when
compiled with ocamlc "abc". Is this a bug?

Roland

--
http://www.lix.polytechnique.fr/~zumkeller/

Francois Maurel

unread,
Apr 5, 2007, 5:47:57 AM4/5/07
to Alain Frisch
About unintuitive - and, of course, strongly implementation dependent
- Obj.magic stuff, one can also patch your example into:

===
let magic x =
let y =
Obj.magic x
in
y

let () =
let x = if ("a" = "b") then magic 0 else String.copy "abc" in


for i = 0 to 100000 do ignore (ref [1]) done;
Gc.major ();
print_endline x

===

which is somehow "correct".

The version
===
let magic x =
Obj.magic x

...
===
is not.

François

On 04/04/07, Alain Frisch <Alain....@inria.fr> wrote:

Alain Frisch

unread,
Apr 5, 2007, 5:58:52 AM4/5/07
to Roland Zumkeller
Roland Zumkeller wrote:
> On 04/04/07, Alain Frisch <Alain....@inria.fr> wrote:
>> The fact that something is well-typed in Coq does not mean that you can
>> just translate it to OCaml by adding a few Obj.magic to make the
>> type-checker happy.
>
> As I understand Pierre Letouzey's PhD thesis explains how this can be
> done. Your example couldn't result from translating a Coq term, since
> "String.copy", "ref", and "Gc.major" are not part of its formalism (a
> flavour of lambda calculus with inductive types).

Here is an example which might fall into this class:

let f n =
let x = if ("a" = "b") then Obj.magic 0 else (Some n) in
for i = 0 to 100000 do ignore (Some i) done;
(match x with Some n -> print_int n | None -> ());
in
f 10

It prints 512 on my machine (when compiled with ocamlopt).

> On my machine it prints "$d$d$d$d$d$d$d$d$d$d$d$d$d$d$d" and when
> compiled with ocamlc "abc". Is this a bug?

No, undefined behaviors are not specified.

-- Alain

Daniel de Rauglaudre

unread,
Apr 5, 2007, 6:08:37 AM4/5/07
to caml...@yquem.inria.fr
This simpler example segmentation faults on my machine :

let () =
let x = if "a" = "b" then Obj.magic 0 else String.copy "abc" in
for i = 0 to 100000 do ignore (ref [1]) done;
print_endline x

--
Daniel de Rauglaudre
http://pauillac.inria.fr/~ddr/

0 new messages