Newbie polymorphism

Ryan Tarpine

unread,

Aug 22, 2001, 9:43:53 AM8/22/01

to

Could someone give me a simple understanding of why

let len =
let rec length acc = function
[] -> 0
| (h::tail) -> length (acc+1) tail
in length 0;;

has the type ('_a list -> int), but

let len lst =
let rec length acc = function
[] -> 0
| (h::tail) -> length (acc+1) tail
in length 0 lst;;

is ('a list ->int)? I've been learning OCaml as my first functional lang
over the past week, but I still do not understand this yet. Thanks in
advance.

--
Ryan Tarpine, rtar...@hotmail.com
"Technology is a queer thing. It brings you great gifts
in one hand, and it stabs you in the back with the other."
-- C.P. Snow

Ryan Tarpine

unread,

Aug 22, 2001, 10:16:29 AM8/22/01

to

I wrote:

> [] -> 0
> ...
> [] -> 0

Oops, I know, these sould have been [] -> acc. But I still don't know the
answer to my question

Henning Makholm

unread,

Aug 22, 2001, 12:36:34 PM8/22/01

to

Scripsit Ryan Tarpine <rtar...@hotmail.com>

> I've been learning OCaml as my first functional lang over the past
> week, but I still do not understand this yet. Thanks in advance.

I don't know OCaml in detail, but here is an educated guess, based
on a similar behavior in Moscow ML:

Moscow ML version 2.00 (June 2000)
Enter `quit();' to quit.
- fun loop acc [] = acc | loop acc (h::tail) = loop (acc+1) tail
val len1 = loop 0
fun len2 lst = loop 0 lst
;
! Warning: Value polymorphism:
! Free type variable(s) at top level in value identifier len1
> val 'b loop = fn : int -> 'b list -> int
val len1 = fn : 'a list -> int
val 'b len2 = fn : 'b list -> int

I think the '_a notation in your example is analogous to the fact that
Moscow ML does not display an 'a before "len1" in its typing. It means
that the type variable is not generalized each time len1 is mentioned
- the first time it gets a type it has to have the same type for the
entire rest of the session. Indeed:

- len2 [true,false] ;
> val it = 2 : int
- len2 [1,2,3] ;
> val it = 3 : int
- len1 [true,false] ;
! Warning: the free type variable 'a has been instantiated to bool
> val it = 2 : int
- len1 [1,2,3] ;
! Toplevel input:
! len1 [1,2,3] ;
! ^
! Type clash: expression of type
! int
! cannot have type
! bool
-

Why is it so? This is fairly deep, and if your experience with
funtional programming only goes back one week it is likely that
you don't know enough yet to make sense of the rationale. But
here goes:

The reason why the two definitions are type checked differently
is that the definition of len2 is *simpler* than the definition
of len1. In particular, just by looking at its syntactical form
it is clear that *executing* the definition is not going to *do*
anything. It just gives a name to a piece of code.

In contrast, *executing* the defintion of len1 requires us to
apply a function to the number 0 and give the resulting value
the name "len1". The type checker is not very smart, so it
cannot figure out that loop does not really do anything until
it's got both of its argument. For the purpose of type checking
we have to assume that loop may do any kind of strange things
when we call it.

In particular it must be assumed that when we call loop it may
create a reference (or whatever updateable locations are called
in OCaml) whose type includes the element type, and return a
function that uses that reference. If, in that case, we allowed
len1 to be called several times with lists of different types,
the *same* reference cell would be used with different types
throughout its lifetime, and that would compromise type safety.

Here's an example of a malicious loop function (sorry, but I'm
not fluent enough in OCaml to write it in that language):

fun loop n = let val r = ref NONE
fun f [] = n
| (x::xs) = case !r
of NONE => (r := SOME x ; n)
| SOME y => if x=y then 0 else 1
in f end

If len1 was defined in this context it would be possible to trick the
language into comparing values of two different types for equality,
which just might cause the program to crash.

--
Henning Makholm "Manden med det store pindsvin er
kommet vel ombord i den grnne dobbeltdkker."

Simon Helsen

unread,

Aug 22, 2001, 12:38:59 PM8/22/01

to

Hi Ryan,

On 22 Aug 2001, Ryan Tarpine wrote:

>Could someone give me a simple understanding of why
>
>let len =
> let rec length acc = function

> [] -> acc

> | (h::tail) -> length (acc+1) tail
> in length 0;;
>
>has the type ('_a list -> int), but
>
>let len lst =
> let rec length acc = function

> [] -> acc

> | (h::tail) -> length (acc+1) tail
> in length 0 lst;;
>
>is ('a list ->int)? I've been learning OCaml as my first functional lang
>over the past week, but I still do not understand this yet. Thanks in
>advance.

well, you just popped into one of OCaml's (or SML's) less beautiful
properties, I'm afraid. This is not something you ought to worry about in
a first week of functional programming, but I'll try to explain it
briefly. The problem above is summarized by the term "value polymorphism".
A very good explanation on this problem can be found in "ML for the
working programmer, 2nd Edition" by Larry Paulson.

General (Let-) polymorphism does not go well with ML's references (which
allow destructive updates), at least at top-level: in fact, the type
system can be broken if you naively allow both features to exist
simultaneously (see Paulson for examples). Researchers have proposed many
sorts of solutions, mostly unsatisfactory, complicated or badly
interacting with other language features (notably the module system). The
solution finally adopted by both OCaml and SML is a syntactic one and the
fact that it is syntactic causes the funny problem above. Normally, the
type inference engine, tries to find the most general type of a let-bound
expression and then quantifies over its free variables (module those free
in the type-environment, but forget about that here ;-). However, at
top-level, the "value restriction" says that this quantification can only
take place if the bound expression is a (non-reference) value (also called
"non-expansive expression" in SML speak), because such expressions will
definitely not generate references and hence cannot cause the
soundness-problem mentioned above.

In your former example len is bound to an expression which is an
application, namely (length 0). In the latter version, however, len is
bound to an expression which is a function, namely (function lst -> length
0 lst). Note that the notation in your example is syntactic sugar for
this. Now, functions are values, so no value restriction, applications, on
the other hand, are not and hence the value restriction (indicated by the
underscore in the variable '_a)

As you can see, you don't even need references to get into trouble, simply
because the restriction is conservative and dumb, albeit simple (at least,
if you look at the older semantic mechanisms with imperative type
variables and that kind of funny stuff).

Note that the standard solution for the problem above is exactly what you
did: "eta-expansion". (function lst -> length 0 lst) is the eta-expanded
version of (length 0).

As a final remark: in SML the resulting type (in the first case) would be
something like X list -> int, where X is a fresh non-usable type. Trying
to use len would cause a type error. In OCaml, they are a bit more
pragmatic and give it type '_a list -> int. It allows you to instantiate
'_a only once, but no more:

# len;;
- : '_a list -> int = <fun>
# len [true; false];;
- : int = 2
# len ;;
- : bool list -> int = <fun>
# len [1;2];;
Characters 6-7:
This expression has type int but is here used with type bool

I hope this explains your problem!

Kinds Regards,

Simon

Henning Makholm

unread,

Aug 22, 2001, 1:40:08 PM8/22/01

to

Scripsit Simon Helsen <hel...@informatik.uni-freiburg.de>

> As a final remark: in SML the resulting type (in the first case) would be
> something like X list -> int, where X is a fresh non-usable type.

Does the Definition really require that? I was fairly sure that the
behavior of Moscow ML was allowed by the Definition, but I don't have
it around to check.

I *think* also that simply rejecting an expansive definition with
free type variables (as Moscow ML did before 2.00) is legal.

--
Henning Makholm "You are in a little twisting
maze of passages, all different"

Ryan Tarpine

unread,

Aug 22, 2001, 6:41:52 PM8/22/01

to

Henning Makholm wrote:
> ...

> Why is it so? This is fairly deep, and if your experience with
> funtional programming only goes back one week it is likely that
> you don't know enough yet to make sense of the rationale. But
> here goes:

At least you tried ;)

> ... Explanation snipped ...

While it's true, I have only been studying for a week, it does help that
I've been reading and playing with code every day. Both your and Simon's
explanations helped. After you answered I thought to myself, "This is the
kind of questions that should be in a FAQ." Then I realised I never
checked first; lo and behold, there it was :) Still, it wouldn't have been
enough. It helps to know the technical reasons why as well as what one can
do to fix it. However, I think I should bring up the topic that made me
run into this problem. Reading a tutorial I'm not sure where I picked up,
it gives this code:

let fact1 n =
let rec tfact x n =
if n = 0 then x
else tfact (x*n) (n-1) in
tfact 1 n;;

And states "This, however, has the defect that the local recursive
definition is only evaluated after fact1 receives its argument...Moreover,
it is reeavluated each time fact is called. We can change this as follows:"

let fact2 =
... (same tfact)
tfact 1;;

It even goes so far to say that fact2 is 20% faster! Did the technique
cause problems with my function just because it acted on lists instead of
integers? Or am I missing part of the big picture here?

Brian Rogoff

unread,

Aug 23, 2001, 7:43:44 AM8/23/01

to

On 22 Aug 2001, Ryan Tarpine wrote:

> Henning Makholm wrote:
> > ...
> > Why is it so? This is fairly deep, and if your experience with
> > funtional programming only goes back one week it is likely that
> > you don't know enough yet to make sense of the rationale. But
> > here goes:
>
> At least you tried ;)
>
> > ... Explanation snipped ...
>
> While it's true, I have only been studying for a week, it does help that
> I've been reading and playing with code every day. Both your and Simon's
> explanations helped. After you answered I thought to myself, "This is the
> kind of questions that should be in a FAQ." Then I realised I never
> checked first; lo and behold, there it was :) Still, it wouldn't have been
> enough. It helps to know the technical reasons why as well as what one can
> do to fix it. However, I think I should bring up the topic that made me
> run into this problem. Reading a tutorial I'm not sure where I picked up,
> it gives this code:

From that snippet of code, I can tell you where you picked up that tutorial.

http://www.cl.cam.ac.uk/Teaching/Lectures/funprog-jrh-1996/index.html

and for a small fee I can tell you how I knew that ;-).

> let fact1 n =
> let rec tfact x n =
> if n = 0 then x
> else tfact (x*n) (n-1) in
> tfact 1 n;;
>
> And states "This, however, has the defect that the local recursive
> definition is only evaluated after fact1 receives its argument...Moreover,
> it is reeavluated each time fact is called. We can change this as follows:"
>
> let fact2 =
> ... (same tfact)
> tfact 1;;
>
> It even goes so far to say that fact2 is 20% faster! Did the technique
> cause problems with my function just because it acted on lists instead of
> integers?

Yes. You'd have the same problem though if you use this technique on any
polymorphic type (one with type variables, like the 'a in 'a list) since
the value restriction is a pragmantic restriction which occurs because of
interactions between polymorphism and imperative features; no polymorphism,
no problem. There's a very brief discussion of the value restriction on
pages 106-107 of your tutorial. There's also an explanation in the Caml
Expert FAQ, http://caml.inria.fr/FAQ/FAQ_EXPERT-eng.html. Start here
http://caml.inria.fr/FAQ/FAQ_EXPERT-eng.html#polymorphisme and read the
next few sections carefully, and repeatedly, until they make sense or you
fall asleep. If that

> Or am I missing part of the big picture here?

Nope, you're doing great, much better than I did after a week of ML. Keep
it up and in no time you'll have mastered this. Excellent choice of tutorial
IMO too, but you have to be a bit careful since it discusses an older
version of Caml, not Objective Caml. Hopefully there will be an update
soon.

-- Brian

Henning Makholm

unread,

Aug 23, 2001, 7:44:13 AM8/23/01

to

Scripsit Ryan Tarpine <rtar...@hotmail.com>

> let fact2 =
> ... (same tfact)
> tfact 1;;

> It even goes so far to say that fact2 is 20% faster! Did the technique
> cause problems with my function just because it acted on lists instead of
> integers? Or am I missing part of the big picture here?

The difference is that your factorial function never works on anything
but integers anyway. Therefore it is not a problem that the type
checker decides that it can only be used with a single type, because
it only *has* a single type even befor the type checker made that
decision.

The effect surfaces in the case of the list length function because
the definition of that function does not determine uniquely what the
type of the function should be.

Whether or not it's a "problem" depends on whether you actually intend
to use your length function on several different kinds of list.

(I'm not sure about the 20% figure either. The extra cost of fact1
ought to be a constant amount of time and space for each call, so
the larger input you give it (requiring more iterations) the
smaller becomes the difference, relatively).

--
Henning Makholm "Det m vre spndende at bo p
en kugle. Har I nogen sinde besgt de
egne, hvor folk gr rundt med hovedet nedad?"

Simon Helsen

unread,

Aug 23, 2001, 7:44:59 AM8/23/01

to

Hi Henning,

On 22 Aug 2001, Henning Makholm wrote:

>Scripsit Simon Helsen <hel...@informatik.uni-freiburg.de>
>
>> As a final remark: in SML the resulting type (in the first case) would be
>> something like X list -> int, where X is a fresh non-usable type.
>
>Does the Definition really require that? I was fairly sure that the
>behavior of Moscow ML was allowed by the Definition, but I don't have
>it around to check.

First of all, I was a bit sloppy and just let SML "coincide" with SML/NJ,
which is of course not true ;-)

As for the definition, I'm not quite sure. Sections 4.7, 4.8 and rule (15)
pinpoint value polymorphism. The difference between the two approaches
(OCaml/MoscowML and SML/NJ) boils down to whether you allows the
non-quantified type variable (which is not a real variable, but a dummy
type) to be replaced by 'some type' later on. So, in Ocaml/MoscowML, the
compiler looks in the future and changes non-generalised type variables
for a more "useful" type. While, this approach is sound, I'm not very
convinced if this is conform the actual definition.

>I *think* also that simply rejecting an expansive definition with
>free type variables (as Moscow ML did before 2.00) is legal.

sure, but note that SML/NJ does not reject an expansive definition with
free type variables. The free type variables are simply not unifyable.

Regards,

Simon

Andreas Rossberg

unread,

Aug 23, 2001, 7:45:53 AM8/23/01

to

Henning Makholm wrote:
>
> Scripsit Simon Helsen <hel...@informatik.uni-freiburg.de>
>
> > As a final remark: in SML the resulting type (in the first case) would be
> > something like X list -> int, where X is a fresh non-usable type.
>
> Does the Definition really require that? I was fairly sure that the
> behavior of Moscow ML was allowed by the Definition, but I don't have
> it around to check.

Maybe I am a bit disoriented today but my reading of the SML Definition
is that it actually rules out the behaviour of SML/NJ and enforces that
of Moscow ML.

The SML inference rules are non-deterministic. Nothing prevents them
from guessing the correct type for any binding where generalisation
cannot take place due to the value restriction (`correct' meaning the
type the bound id is later used with) - neither in local declarations
nor at the toplevel (*). Of course, an implementation cannot guess but
instead has to introduce free type variables as place holders for yet
undetermined types. However, these type variables have nothing to do
with the type variables occuring in the SML semantics. These `semantic'
type variables and the variables introduced by the type inference
algorithm as place holders for actual types should not be confused. The
latter can stand for the former, but they are something completely
different. The SML semantics does not say anything about these place
holder variables, they are merely a technique for implementing the
non-deterministic inference rules. So there is no way the restrictions
on free type variables in rules 87-89 of the Definition can apply to
them (**).

Of course, this would imply that the intended effect of these
restrictions is not achieved and that the comment in the Definition's
section G.4 on this topic is incorrect, so I am a bit unsure whether I
overlooked something obvious... Can somebody else comment on this?

> I *think* also that simply rejecting an expansive definition with
> free type variables (as Moscow ML did before 2.00) is legal.

This is also illegal if my above argument is correct.

Best regards,

- Andreas

(*) Except that the guess may not be a type that has not been generated
yet because that would collide with the side conditions on type names at
some places in the rules.

(**) Except that they forbid them to be instantiated with anything
containing free semantic variables later on. But there is no way this
could happen in the usual inference algorithm IINM.

--
Andreas Rossberg, ross...@ps.uni-sb.de

"Computer games don't affect kids; I mean if Pac Man affected us
as kids, we would all be running around in darkened rooms, munching
magic pills, and listening to repetitive electronic music."
- Kristian Wilson, Nintendo Inc.

Simon Helsen

unread,

Aug 23, 2001, 9:17:39 AM8/23/01

to

Hi Ryan,

On 22 Aug 2001, Ryan Tarpine wrote:

>let fact1 n =
> let rec tfact x n =
> if n = 0 then x
> else tfact (x*n) (n-1) in
> tfact 1 n;;
>
>And states "This, however, has the defect that the local recursive
>definition is only evaluated after fact1 receives its argument...Moreover,
>it is reeavluated each time fact is called. We can change this as follows:"
>let fact2 =
> ... (same tfact)
> tfact 1;;
>
>It even goes so far to say that fact2 is 20% faster! Did the technique
>cause problems with my function just because it acted on lists instead of
>integers? Or am I missing part of the big picture here?

The problem surely popped up because you were working with generic lists
(in the previous mail), containing free type variables. In the example
above, there are no free type variables, so no value restriction problems.

It is of course true that the closure for the recursive function is built
immediately in the second example, whereas it is rebuilt every time in the
first example, giving some speedup. However, I'm amazed it is 20%
actually! With what compiler was that done? It seems to be very slow in
building closures...

To these conflicting goals, I cannot give a proper answer. However, in the
many thousends of ML code I've written, I rarely bumped into the value
restriction problem and speed-ups were usually obtained in the core
algorithms. Personally, I think it is more a problem of the compiler that
in the above case such a big performance difference pops up. But, perhaps,
it's better if some ML-compiler writer (Stephen?) answers this question!

regards,

Simon

Henning Makholm

unread,

Aug 23, 2001, 11:04:40 AM8/23/01

to

Scripsit Simon Helsen <hel...@informatik.uni-freiburg.de>

> To these conflicting goals, I cannot give a proper answer. However, in the
> many thousends of ML code I've written, I rarely bumped into the value
> restriction problem

I meet it fairly often. But that may depend on coding style. I like to
work a lot with combinators and do thinks like

local
fun singletonlist x = [x]
in
val rowmatrix = singletonlist
val colmatrix = map singletonlist
end

which in most cases works quite fine (with Moscow ML) in a local scope,
but hits me with the value polymorphism restriction in the cases where
I'm trying to implement a signature such as

sig
val rowmatrix : 'a list -> 'a list list
val colmatrix : 'a list -> 'a list list
end

I'm not really complaining - it's easy enough to eta-expand the
definition when the problem actually surfaces. (And in the work
I do, I don't tend to pay a great deal of attention to efficiency,
at least when we get down to constant factors).

--
Henning Makholm "En tapper tinsoldat. En dame i
spagat. Du er en lykkelig mand ..."

Ryan Tarpine

unread,

Aug 23, 2001, 1:37:21 PM8/23/01

to

Brian Rogoff wrote:
> ...

> Nope, you're doing great, much better than I did after a week of ML. Keep
> it up and in no time you'll have mastered this. Excellent choice of
> tutorial IMO too, but you have to be a bit careful since it discusses an
> older version of Caml, not Objective Caml. Hopefully there will be an
> update soon.
>
> -- Brian

I wish somehow I could make one reply to everyone who responded. Like a
bad C++ inheritance tree that joins together multiple classes that
inherited from one parent ;) I am grateful for everyone's help; I feel I
understand everything much better now. I'm glad people watch this list
even though it happens to be slow (for at least the past week when I've
watched it). Where do most ML folks discuss their problems? This group
seems to slow; is there someplace on IRC? Or is it just that simple once
you get the hang of it? :) The reason I came to OCaml was that I soon will
be starting my second math-intensive project, and this time I promised
myself it wouldn't be in C++. My first choice would be Ruby, but
interpreted + math = Zzzzz. I have a folder full of tutorials and example
code, and I think I'm moving through them pretty well. Soon I'm going to
have to stop testing the waters and jump in. I'll try to keep my head
above the water, but in case I can't you should expect me back here ;)

Brian Rogoff

unread,

Aug 23, 2001, 9:21:45 PM8/23/01

to

On 23 Aug 2001, Ryan Tarpine wrote:
> I wish somehow I could make one reply to everyone who responded. Like a
> bad C++ inheritance tree that joins together multiple classes that
> inherited from one parent ;) I am grateful for everyone's help; I feel I
> understand everything much better now. I'm glad people watch this list
> even though it happens to be slow (for at least the past week when I've
> watched it). Where do most ML folks discuss their problems?

Since you've chosen OCaml, I suggest that you join the Caml mailing list,
which is a veritable hotbed of activity compared to this group. That has
pluses and minuses. Go to http://caml.inria.fr/ and look at the bottom of
the page.

> This group seems to slow; is there someplace on IRC?

It's a moderated newsgroup, and I suspect that a lot of people are intimidated
by the academic nature of many of the discussions. I know at least one lurker
(you know who you are! ;-) who hasn't posted yet, and even though he has
chosen SML, I hope we're still friends.

> Or is it just that simple once you get the hang of it? :)

Once you start to understand the lambda calculus and basic type theory, it
starts becoming a lot easier. It's also (IMO) much more rewarding than
studying C++ or Ada, or any mainstream languages, which tend to have
rather ad-hoc type systems. It seems you need to exert as much effort
mastering the idiosyncracies of C++ as I spent on my undergrad math
degree, and that's way too much, since it's not really general knowledge.
The time you spend learning the theory behind ML will not be wasted if you
move on to another FPL.

Welcome aboard, and I hope to be seeing your posts on the caml-list too.

-- Brian

Stephen Weeks

unread,

Aug 23, 2001, 9:22:55 PM8/23/01

to

> X
> X
>
> * Newsgroups: comp.lang.ml
> * From: Simon Helsen <hel...@informatik.uni-freiburg.de>
> * Subject: Re: Newbie polymorphism
> * Date: 23 Aug 2001 13:17:39 GMT
> * Organization: Rechenzentrum der Universitaet Freiburg, Germany
> * Approved: comp-l...@cs.cmu.edu
> * Reply-To: Simon Helsen <she...@acm.org>
>
> ------------------------------------------------------------------------

> Personally, I think it is more a problem of the compiler that
> in the above case such a big performance difference pops up. But, perhaps,
> it's better if some ML-compiler writer (Stephen?) answers this question!

I will comment on my favorite ML compiler only :-). MLton generates identical
code for fact1 and fact2 (exact SML code below). It's easy enough to do in the
case where the compiler inlines the calls to fact1 and fact2 (as in the code
below), as well as the "first" call to the curried tfact. I'm pretty sure those
inlinings will always happen with MLton because they are all small enough, since
the code only performs a call or allocates a closure record. In both fact1 and
fact2, the meat of the code is the body of tfact, which is the same -- the outer
trimmings don't really matter.

fun fact1 n =
let
fun tfact x n =

if n = 0
then x
else tfact (x*n) (n-1)
in
tfact 1 n

end

val _ = fact1 13

val fact2 =
let
fun tfact x n =

if n = 0
then x
else tfact (x*n) (n-1)
in
tfact 1

end

val _ = fact2 15

> ------------------------------------------------------------------------
> Mailgate Follow-Ups:
> Re: Newbie polymorphism, Henning Makholm
> ------------------------------------------------------------------------
> Mailgate References:
> Newbie polymorphism, Ryan Tarpine
> Re: Newbie polymorphism, Henning Makholm
> Re: Newbie polymorphism, Ryan Tarpine
> ------------------------------------------------------------------------

--
Posted from firewa...@magrathea.epr.com [198.3.163.3]
via Mailgate.ORG Server - http://www.Mailgate.ORG

Stephen Weeks

unread,

Aug 23, 2001, 9:25:58 PM8/23/01

to

>
>
>
> * Newsgroups: comp.lang.ml
> * From: Andreas Rossberg <ross...@ps.uni-sb.de>
> * Subject: Free type variables in SML (was: Newbie polymorphism)
> * Date: 23 Aug 2001 11:45:53 GMT
> * Organization: Universität des Saarlandes
> * Approved: comp-l...@cs.cmu.edu
> * Reply-To: Andreas Rossberg <ross...@ps.uni-sb.de>
>
> ------------------------------------------------------------------------

In an attempt to understand this free type variable stuff, I tried out seven
programs (see below) on five SML compilers, to see which compilers accepted
which programs (a warning counts as acceptance). Here are the results.

1.sml 2.sml 3.sml 4.sml 5.sml 6.sml 7.sml
ML Kit 3.9.1 no yes yes no yes yes yes
MLton 20010806 yes yes yes yes yes yes yes
Moscow ML 2.0 no yes yes yes yes yes yes
Poly/ML 4.0 no yes no yes yes no yes
SML/NJ 110.34 yes no no no yes yes no

----------------------------------------
(* 1.sml *)
val id = (fn x => x) (fn x => x)
----------------------------------------
(* 2.sml *)
val id = (fn x => x) (fn x => x)
val _ = id 13
----------------------------------------
(* 3.sml *)
val id = (fn x => x) (fn x => x)
;
val _ = id 13
----------------------------------------
(* 4.sml *)
val id = (fn x => x) (fn x => x)
datatype t = T
val _ = id T
----------------------------------------
(* 5.sml *)
local
val id = (fn x => x) (fn x => x)
in
val _ = id 13
end
----------------------------------------
(* 6.sml *)
val id = (fn x => x) (fn x => x)
val id = ()
----------------------------------------
(* 7.sml *)
val id = (fn x => x) (fn x => x)
val _ = id 13
val id = ()
----------------------------------------

> > I *think* also that simply rejecting an expansive definition with
> > free type variables (as Moscow ML did before 2.00) is legal.
>
> This is also illegal if my above argument is correct.

According to the end of section G.8, it is permissable (but not required) to
reject an expansive topdec with free type variables instead of substituting
monotypes for them to ensure that no free type variables enter the basis.

Based on this, I conclude that according to the definition, the above programs
1, 2, 3, and 4 may be rejected.

On the other hand, section G.8 also states that "implementations should not
reject programs for which successful elaboration is possible". According to
this, I conclude that 5, 6, and 7 must be accepted.

Finally, according to the side conditions on type names (third paragraph of
section 4.10) mentioned by Andreas, I conclude that program 4 must be rejected.

So, according to my interpretation of the definition, here are the allowable
behaviors on the above seven programs.

1 no or yes
2 no or yes
3 no or yes
4 no
5 yes
6 yes
7 yes

Only the ML Kit matches my reading.

The practical conclusion that I draw is if you want portability amongst SML
compilers, don't use value declarations which introduce free type variables at
the top level. I don't think this restriction matters in practice, and have
never run into problems with it.

> ------------------------------------------------------------------------
> Mailgate References:
> Newbie polymorphism, Ryan Tarpine

> Re: Newbie polymorphism, Simon Helsen

> Re: Newbie polymorphism, Henning Makholm
> ------------------------------------------------------------------------
>
>
>
>
>

Andreas Rossberg

unread,

Aug 24, 2001, 9:09:04 AM8/24/01

to

Stephen Weeks wrote:
>
> > > I *think* also that simply rejecting an expansive definition with
> > > free type variables (as Moscow ML did before 2.00) is legal.
> >
> > This is also illegal if my above argument is correct.
>
> According to the end of section G.8, it is permissable (but not required) to
> reject an expansive topdec with free type variables instead of substituting
> monotypes for them to ensure that no free type variables enter the basis.

Yes, but my argument is that this comment is pointless because there
always is an alternative elaboration that does not introduce these free
type variables in the first place. Consequently, all of 1, 2, and 3 must
be accepted. My impression is that this comment is mixing up type
variables in the sense of the semantics with type variables used at the
implementation level for type inference reasons.

[Actually, if my argument makes sense I believe it implies that the idea
of defining principality of environments modulo the choice of free
imperative type variables in the SML'90 Definition (4.12) is pretty
meaningless as well. An environment simply cannot be principal if it
contains undetermined imperative types - a free type variable is never
principal. This again implies that the claim made in 4.12 about the
existence of principal environments for any valid dec is false. The
claim is based on the same confusion about the role of type variables as
above. Or am I completely off track now?]

So according to my interpretation the allowable behaviour is

1 yes
2 yes
3 yes

4 no
5 yes
6 yes
7 yes

And no compiler matches my reading (but MLton comes closest :-).

I am suprised to see that some compilers really reject 4. As has been
noted by Kahrs [1] the illegalness of this example is rather an artefact
of the way type generativity is modelled in the Definition. There is no
deeper reason for it and it is actually tedious to implement.

And I am even more surprised to see that some systems do not accept 6 or
7.

> I don't think this restriction matters in practice, and have
> never run into problems with it.

That might be true, although forbidding free type variables can be
annoying in interactive use.

However, what matters in practice is that eg. SML/NJ already rejects
free type variables at structure level - the following is not accepted
either (I don't know how the ML Kit or PolyML handle it - only have
Moscow ML and MLton at hands):

(* 2'.sml *)
structure X =
struct

val id = (fn x => x) (fn x => x)
val _ = id 13

end

And I bump into that quite frequently. But this is clearly a bug - I
don't recall whether I ever sent a bug report, though, have to check.

Best regards,

- Andreas

[1] Stefan Kahrs
Mistakes and Ambiguities in the Definition of Standard ML
University of Edinburgh, 1993
http://www.cs.ukc.ac.uk/pubs/1993/569/

Ken Rose

unread,

Aug 24, 2001, 1:53:36 PM8/24/01

to

Ryan Tarpine wrote:
Where do most ML folks discuss their problems?

Try comp.lang.functional

- ken

Stephen Weeks

unread,

Aug 24, 2001, 1:54:48 PM8/24/01

to

> > According to the end of section G.8, it is permissable (but not required) to
> > reject an expansive topdec with free type variables instead of substituting
> > monotypes for them to ensure that no free type variables enter the basis.
>

> Yes, but my argument is that this comment is pointless because there
> always is an alternative elaboration that does not introduce these free
> type variables in the first place. Consequently, all of 1, 2, and 3 must
> be accepted. My impression is that this comment is mixing up type
> variables in the sense of the semantics with type variables used at the
> implementation level for type inference reasons.

I assume that the alternative elaborations that you refer to are the ones that
choose monotypes for the unconstrained types instead of type variables (which
are free at the toplevel). I think that this part of the definition is
referring to precisely those alternative elaborations when it says "substituting
monotypes". Hence, I don't see any inconsistency and still think that this part
allows 1, 2, and 3 to be rejected.

I agree that the Definition confuses free type variables with unconstrained
types used in the implementation of unification.

> > I don't think this restriction matters in practice, and have
> > never run into problems with it.
>

> That might be true, although forbidding free type variables can be
> annoying in interactive use.

You can see why I wouldn't notice that :-).

> However, what matters in practice is that eg. SML/NJ already rejects
> free type variables at structure level - the following is not accepted
> either (I don't know how the ML Kit or PolyML handle it - only have
> Moscow ML and MLton at hands):
>
> (* 2'.sml *)
> structure X =
> struct

> val id = (fn x => x) (fn x => x)
> val _ = id 13

> end

Both the ML Kit and Poly/ML accept 2'.sml.

I'd prefer to see the definition be more restrictive, requiring 1, 2, 2', and 3
to be rejected. Then, there is a simple local rule that works in practice (and
that I happen to use :-), which is, do not have expansive declarations with
unconstrained types. I admit that having grown up with SML/NJ may have colored
my opinion.

Andreas Rossberg

unread,

Aug 24, 2001, 5:33:57 PM8/24/01

to

Stephen Weeks <swe...@acm.org> wrote:
>
> > > According to the end of section G.8, it is permissable (but not required) to
> > > reject an expansive topdec with free type variables instead of substituting
> > > monotypes for them to ensure that no free type variables enter the basis.
> >
> > Yes, but my argument is that this comment is pointless because there
> > always is an alternative elaboration that does not introduce these free
> > type variables in the first place. Consequently, all of 1, 2, and 3 must
> > be accepted. My impression is that this comment is mixing up type
> > variables in the sense of the semantics with type variables used at the
> > implementation level for type inference reasons.
>
> I assume that the alternative elaborations that you refer to are the ones that
> choose monotypes for the unconstrained types instead of type variables (which
> are free at the toplevel).

Yes, and that choice is made in rule 34 which decides the type the
whole RHS will have by choosing a type for x (in other examples rule 2
will do the magic). In particular, for example 2 and 3, those rules
can choose types that are consistent with later uses of id. Similarly,
for 1 the rule can choose some type that does not contain a type
variable. In all cases, the restrictions in rules 87-89 are trivially
met and the comment in G.8 does not change anything. Thus there exists
a successful elaboration for each of the examples. So they are valid.

> I think that this part of the definition is
> referring to precisely those alternative elaborations when it says "substituting
> monotypes".

I believe that sentence tries to describe what NJ does: substituting
free
type variables (that have been inferred before) by arbitrary skolem
types. This comment seems to be intended to _relax_ the restrictions
in rules 87-89. But this comment will never apply because in the
alternative, successful elaboration such type variables do not occur
at all.

> Hence, I don't see any inconsistency and still think that this part
> allows 1, 2, and 3 to be rejected.

There is not really an inconsistency - but the restrictions and rules
about free type variables simply have no effect because they can
always be side-stepped by alternative, more permissive elaborations.

Consequently, I disagree with you: 1-3 may not be rejected. And the
only way to implement that behaviour seems to be by doing what Moscow
ML does.

> > That might be true, although forbidding free type variables can be
> > annoying in interactive use.
>
> You can see why I wouldn't notice that :-).

Sure. :-)

> I'd prefer to see the definition be more restrictive, requiring 1, 2, 2', and 3
> to be rejected. Then, there is a simple local rule that works in practice (and
> that I happen to use :-), which is, do not have expansive declarations with
> unconstrained types. I admit that having grown up with SML/NJ may have colored
> my opinion.

Mh, I think I do not like that :-). I happen to use the simple `rule'
that the type-checker ought to be smart enough to figure out anything
that is unambiguous by itself. The annoyence level of the value
restriction still is high enough that way.

Best regards,

- Andreas