Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.

Dismiss

6 views

Skip to first unread message

Nov 13, 2007, 6:42:33 PM11/13/07

to caml-list

When one writes

type row = int

type col = int

This allows one to use the type names "row" and "col" as synonyms of

int. But it doesn't prevent one from using a value of type row in the

place of a value of type col. OCaml allows us to enforce row as

distinct from int two ways:

1) Variants:

type row = Row of int

type col = Col of int

Downside: unnecessary boxing and tagging

conversion from row -> int: (fun r -> match r with Row i -> i)

conversion from int -> row: (fun i -> Row i)

2) Functors:

module type RowCol =

sig

type row

val int_of_row : row -> int

val row_of_int : int -> row

type col

val int_of_col : col -> int

val col_of_int : int -> col

end

module Main = functor (RC: RowCol) -> struct

(* REST OF PROGRAM HERE *)

end

Any code using rows and cols could be written to take a module as a

parameter, and because of the abstraction granted when doing so, type

safety is ensured.

Downside: functor overhead, misuse of functors, need to write

boilerplate conversion functions

conversion from row -> int, int -> row: provided by RowCol boilerplate

IS THE FOLLOWING POSSIBLE:

Modify the type system such that one can declare

type row = new int

type col = new int

Row and col would thus become distinct from int, and require explicit

casting/coercion (2 :> row). There would be no runtime overhead for use

of these types, only bookkeeping overhead at compilation.

Downside: compiler changes (hopefully not too extensive)

conversion from row -> int: (fun r -> (r :> int)) (* might need (r : row

:> int) if it's not already inferred *)

conversion from int -> row: (fun i -> (i :> row))

Thoughts? Do any of you use Variants or Functors to do this now? Do

you find this style of typing useful?

E.

_______________________________________________

Caml-list mailing list. Subscription management:

http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list

Archives: http://caml.inria.fr

Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

Bug reports: http://caml.inria.fr/bin/caml-bugs

Nov 13, 2007, 7:08:38 PM11/13/07

to Edgar Friendly, caml-list

There's a simple trick that Steven Weeks introduced to us and that we now

use at Jane Street for this kind of thing.

use at Jane Street for this kind of thing.

You write down a signature:

module type Abs_int : sig

type t

val to_int : t -> int

val of_int : int <- t

end

And then you write concrete module Int that implements this signature. You

can then write:

module Row : Abs_int = Int

module Col : Abs_int = Int

You can now use Row.t and Col.t as abstract types. The boilerplate is

written once, but can be used over and over. I've personally seen more

use-cases for this with strings than with ints (to separate out different

kinds of identifiers)

y

Nov 13, 2007, 7:22:40 PM11/13/07

to Yaron Minsky, caml-list

On Tue, 13 Nov 2007, Yaron Minsky wrote:

> There's a simple trick that Steven Weeks introduced to us and that we now

> use at Jane Street for this kind of thing.

>

> You write down a signature:

>

> module type Abs_int : sig

> type t

> val to_int : t -> int

> val of_int : int <- t

> end

>

> And then you write concrete module Int that implements this signature. You

> can then write:

>

> module Row : Abs_int = Int

> module Col : Abs_int = Int

>

> You can now use Row.t and Col.t as abstract types. The boilerplate is

> written once, but can be used over and over. I've personally seen more

> use-cases for this with strings than with ints (to separate out different

> kinds of identifiers)

That's the best solution I've seen so far.

Otherwise there's still the following:

http://martin.jambon.free.fr/ocaml.html#opaque

which in theory doesn't require new module or type declarations, but it

adds type parameters, which can be confusing.

Martin

--

http://wink.com/profile/mjambon

http://martin.jambon.free.fr

Nov 14, 2007, 2:59:01 AM11/14/07

to Martin Jambon, caml-list, Yaron Minsky

> On Tue, 13 Nov 2007, Yaron Minsky wrote:

>

> >There's a simple trick that Steven Weeks introduced to us and that we now

> >use at Jane Street for this kind of thing.

> >

> >You write down a signature:

> >

> >module type Abs_int : sig

> >type t

> >val to_int : t -> int

> >val of_int : int <- t

> >end

> >

> >And then you write concrete module Int that implements this signature. You

> >can then write:

> >

> >module Row : Abs_int = Int

> >module Col : Abs_int = Int

> >

> >You can now use Row.t and Col.t as abstract types. The boilerplate is

> >written once, but can be used over and over. I've personally seen more

> >use-cases for this with strings than with ints (to separate out different

> >kinds of identifiers)

>

> That's the best solution I've seen so far.

>

> Otherwise there's still the following:

> http://martin.jambon.free.fr/ocaml.html#opaque

>

> which in theory doesn't require new module or type declarations, but it

> adds type parameters, which can be confusing.

>

>

> Martin

>

> >There's a simple trick that Steven Weeks introduced to us and that we now

> >use at Jane Street for this kind of thing.

> >

> >You write down a signature:

> >

> >module type Abs_int : sig

> >type t

> >val to_int : t -> int

> >val of_int : int <- t

> >end

> >

> >And then you write concrete module Int that implements this signature. You

> >can then write:

> >

> >module Row : Abs_int = Int

> >module Col : Abs_int = Int

> >

> >You can now use Row.t and Col.t as abstract types. The boilerplate is

> >written once, but can be used over and over. I've personally seen more

> >use-cases for this with strings than with ints (to separate out different

> >kinds of identifiers)

>

> That's the best solution I've seen so far.

>

> Otherwise there's still the following:

> http://martin.jambon.free.fr/ocaml.html#opaque

>

> which in theory doesn't require new module or type declarations, but it

> adds type parameters, which can be confusing.

>

>

> Martin

In the next version of the compiler, you will have an extension to the

definition of types with private construction functions (aka ``private''

types) that just fulfills your programming concern: in addition to usual

private sum and record private type definitions, you can now define a type

abbreviation which is private to the implementation part of a module (see

note 1).

Since the values of a private type are concrete, it is much easier for the

programmer to use the values of the type abbreviation.

Using the new private type abbreviation feature, you can write for instance:

row.ml

------

type row = int;;

let make i =

if i < 0 then failwith "Row: cannot create a negative row" else

i;;

let from i = i;;

row.mli

-------

type row = private int;;

val make : int -> row;;

val from : row -> int;;

This solution does not require any additional tagging or boxing, only the

usage of injection/projection function for the type. As for usual private

types, the injection function is handy to provide useful invariants (in the

row type example case, we get ``a row value is always a positive integer'').

In addition, the private abbreviation is publicly exposed in the interface

file. Hence, the compiler knows about it: it can (and effectively does)

optimize the programs that use values of type row in the same way as if the

type row were defined as a regular public abbreviation.

Last but not least, being an instance of the identity function, the from

projection function is somewhat generic: we can dream to suppress the burden

of having to write it for each private type definition, if we find a way to

have it automatically associated to the new private type by the compiler.

Best regards,

Note 1: this is a generalization for regular type abbreviations of the

polymorphic variants private type definitions that Jacques Garrigue already

introduced to provide polymorphic variants (and object) types with private

row variables.

--

Pierre Weis

INRIA Rocquencourt, http://bat8.inria.fr/~weis/

Nov 14, 2007, 6:01:52 AM11/14/07

to Edgar Friendly, caml-list

Am Dienstag, den 13.11.2007, 17:41 -0600 schrieb Edgar Friendly:

> When one writes

>

> type row = int

> type col = int

>

> This allows one to use the type names "row" and "col" as synonyms of

> int. But it doesn't prevent one from using a value of type row in the

> place of a value of type col. OCaml allows us to enforce row as

> distinct from int two ways:

>

> 1) Variants:

> type row = Row of int

> type col = Col of int

>

> Downside: unnecessary boxing and tagging

> conversion from row -> int: (fun r -> match r with Row i -> i)

> When one writes

>

> type row = int

> type col = int

>

> This allows one to use the type names "row" and "col" as synonyms of

> int. But it doesn't prevent one from using a value of type row in the

> place of a value of type col. OCaml allows us to enforce row as

> distinct from int two ways:

>

> 1) Variants:

> type row = Row of int

> type col = Col of int

>

> Downside: unnecessary boxing and tagging

> conversion from row -> int: (fun r -> match r with Row i -> i)

Note that you can write much shorter:

(fun Row i -> i), or

let some_function (Row i) = ...

If the small runtime penalty is not essential, this is probably the best

way to do it.

Gerd

--

------------------------------------------------------------

Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany

ge...@gerd-stolpmann.de http://www.gerd-stolpmann.de

Phone: +49-6151-153855 Fax: +49-6151-997714

------------------------------------------------------------

Nov 14, 2007, 6:06:48 AM11/14/07

to caml...@yquem.inria.fr

On Wednesday 14 November 2007 11:01, Gerd Stolpmann wrote:

> (fun Row i -> i), or

> (fun Row i -> i), or

Do you mean:

function Row i -> i

or:

fun (Row i) -> i

--

Dr Jon D Harrop, Flying Frog Consultancy Ltd.

http://www.ffconsultancy.com/products/?e

Nov 14, 2007, 7:37:56 AM11/14/07

to Pierre Weis, caml-list

Pierre Weis wrote:

> type row = private int;;

> type row = private int;;

The only difference with an abstract type is that some generic

operations (comparision, equality) can be optimized, and currently, it

happens only for some basic types. So exporting a private abbreviation

(instead of an abstract type) is useless when the type is not a basic

type. Is it correct?

I find it somewhat disturbing to expose low-level optimization features

as part of the type system. Couldn't a similar thing (specializing

operations on integers) be achieved by always storing manifest types in

cmxs files? (Within a single compilation unit, the compiler could keep

type definitions across module boundaries.)

Alain

Nov 14, 2007, 8:56:33 AM11/14/07

to caml...@yquem.inria.fr

Le mer 14 nov 2007 13:37:24 CET,

Alain Frisch <al...@frisch.fr> a écrit :

Alain Frisch <al...@frisch.fr> a écrit :

> Pierre Weis wrote:

> > type row = private int;;

>

> The only difference with an abstract type is that some generic

> operations (comparision, equality) can be optimized, and currently,

> it happens only for some basic types. So exporting a private

> abbreviation (instead of an abstract type) is useless when the type

> is not a basic type. Is it correct?

>

As far as I understand Pierre's original email, this is not simply

about optimization (which is in fact a simple by-product of the

feature). The main point of private is to distinguish between the

construction of a value (where you might want to enforce some invariant)

and its use (where you might rely on the invariant). You can use a

value of type row everywhere a plain int is expected (in particular for

addition, subtraction, etc.), without having to redefine the

corresponding functions as would have been necessary with an abstract

type. On the other hand, the construction of a value of type row from a

plain int is impossible outside of the constructor function provided in

the module, which checks that the invariant required for row is

satisfied. This can be useful for other types as well, consider for

instance the following module (not tested of course)

module Sorted:

sig

type sorted_list = private int list

val empty: sorted_list

val insert: int -> sorted_list -> sorted_list

end =

struct

type sorted_list = int list

let empty = []

let rec insert x = function

[] -> [x]

| y::_ as l when x < y -> x::l

| y::l -> y :: insert x l

end

A value of type Sorted.sorted_list is a list, for which all operations

of module List are available (as well as pattern-matching), but in

addition it is guaranteed that a sorted_list is sorted, since it must

be built with empty and insert.

--

E tutto per oggi, a la prossima volta.

Virgile

Nov 14, 2007, 9:35:45 AM11/14/07

to caml...@inria.fr

Edgar Friendly <thele...@gmail.com> writes:

> When one writes

>

> type row = int

> type col = int

>

> This allows one to use the type names "row" and "col" as synonyms of

> int. But it doesn't prevent one from using a value of type row in the

> place of a value of type col. OCaml allows us to enforce row as

> distinct from int two ways:

>

> 1) Variants:

> type row = Row of int

> type col = Col of int

>

> Downside: unnecessary boxing and tagging

> conversion from row -> int: (fun r -> match r with Row i -> i)

> conversion from int -> row: (fun i -> Row i)

> When one writes

>

> type row = int

> type col = int

>

> This allows one to use the type names "row" and "col" as synonyms of

> int. But it doesn't prevent one from using a value of type row in the

> place of a value of type col. OCaml allows us to enforce row as

> distinct from int two ways:

>

> 1) Variants:

> type row = Row of int

> type col = Col of int

>

> Downside: unnecessary boxing and tagging

> conversion from row -> int: (fun r -> match r with Row i -> i)

> conversion from int -> row: (fun i -> Row i)

I agree with Gerd, this is probably the least-effort solution. If I'm

not making mistake, even with the new feature introduced by Pierre (Btw,

really nice feature!), you still have to make the restriction at the

module level. I.e., only the outside world of current module will

benefit from this protection, and if you want to use this protection

right now, you still have to wrap type row/col as modules.

You may consider using variant with the syntax convention in Gerd's

post, or you can consider using record. E.g.

type row = {r:int} and col = {c:int}

then the "from" method are just record construction like {c=3} and "to"

method are just field access like x.r.

On efficiency, note that if your single variant (record) has multiple

fields, e.g. "type row = Row of int * int", this encoding doesn't

introduce more (un)boxing than "type row = int * int". The difference

happens when the single variant (record) has single field itself. There

was discussion on this topic before. I do agree that eliminating extra

(un)boxing on single variant type (and single immutable field record

type) would be nice.

> 2) Functors:

> module type RowCol =

> sig

> type row

> val int_of_row : row -> int

> val row_of_int : int -> row

> type col

> val int_of_col : col -> int

> val col_of_int : int -> col

> end

>

> module Main = functor (RC: RowCol) -> struct

> (* REST OF PROGRAM HERE *)

> end

>

If you want to make the protection effective in "current" module, you

anyway needs some module-level abstraction in current OCaml. The following

functor tries to do the simple wrapping for any type, and can be defined

once for all:

module type ANY = sig type t end

module type ABS = sig type t type nat val box: nat -> t val unbox: t -> nat end

module Abstr(T:ANY) : ABS with type nat = T.t = struct

type t = T.t and nat = T.t

let box x = x and unbox x = x

end

(* Test *)

# module Row = Abstr(struct type t = int end)

# module Col = Abstr(struct type t = int end)

# let x = Row.box 3 and y = Col.box 5;;

val x : Row.t = <abstr>

val y : Col.t = <abstr>

# let f x y = Row.unbox x + Col.unbox y;;

val f : Row.t -> Col.t -> int = <fun>

--

Zheng Li

http://www.pps.jussieu.fr/~li

Nov 14, 2007, 9:35:52 AM11/14/07

to Alain Frisch, caml-list, Pierre Weis

Hi Alain,

> Pierre Weis wrote:

> >type row = private int;;

>

> The only difference with an abstract type is that

a private type is not an abstract type but a concrete type.

> some generic operations (comparision, equality) can be optimized, and

> currently, it happens only for some basic types. So exporting a private

> abbreviation (instead of an abstract type) is useless when the type is not

> a basic type. Is it correct?

I don't understand what you mean by ``exporting a private type abbreviation

is useless''. In fact, you cannot create a private abbreviation if you do not

`export'' it and define it as private when exporting, isn't it ?

(Clearly, if you define a private type abbreviation in an implementation

file, this definition is completely useless since you cannot define any

value of the private type).

However, if I interpret what you said as: ``if we only want to optimize the

compilation, there is no point to define a private type instead of an

abstract type in the case of a complex type abbreviation'' then the question

is somewhat equivalent to ask ``in which case the compiler can optimize when

faced to a type abbreviation ?'': the answer is complex but in particular the

compiler can optimize a type t defined as being a (concrete) abbreviation for

a float array or a (concrete) abbreviation for a record of floats.

In any case, the purpose of introducing private abbreviation type definitions

is not to optimize programs but to write clearer and safer programs. In

particular because private data types can ensure invariants the same way as

abstract data types do; and also because private data types support pattern

matching the same way as concrete types do.

Once more, private abbreviation types being concrete, they get all the

advantages of concrete types and all their drawbacks (in particular wrt

abstraction of value representation, evolution of implementation programs,

openness of algorithms, meta informations).

> I find it somewhat disturbing to expose low-level optimization features

> as part of the type system.

No low-level optimization is ``exposed in the type system'': the low level

(type-based) optimizations are just an automatic side effect of a private

data type being indeed a concrete type. In other words, would you say that

you ``expose low level optimization as part of the type system'' if you ever

define a vector as being a float array instead of being a M.t array value

with M.t being an abstract abbreviation for float ?

In some cases, the answer can be yes: you may need to ``unabstract'' types

for the sake of efficiency. But generally speaking, you just use the type

float because it is the type of values you need, the optimizations come then

from the compiler as a very welcome bonus.

> Couldn't a similar thing (specializing operations on integers) be achieved

> by always storing manifest types in .cmxs files? (Within a single

> compilation unit, the compiler could keep type definitions across module

> boundaries.)

That's another story: the problem with this approach is to prevent breaking

the separate compilation feature of the actual compiler.

In any case, private type abbreviations have been implemented as a new useful

tool for the programmer, not as a way to optimize programs. Type-based

compiler's optimizations are just an extra plus, I thought it was worth to

mention that private type abbreviations are fully compatible with these

optimizations.

But in the first place, private type abbreviations are useful to define proper

concrete sub types of another type.

All the best,

-- Pierre

Nov 14, 2007, 11:10:12 AM11/14/07

to Yaron Minsky, caml-list

Yaron Minsky wrote:

> module type Abs_int : sig

> type t

> val to_int : t -> int

> val of_int : int <- t

> end

>

> And then you write concrete module Int that implements this signature. You

> can then write:

>

> module Row : Abs_int = Int

> module Col : Abs_int = Int

>

This approximates my idea of "optimal" well enough that I'll rewrite> module type Abs_int : sig

> type t

> val to_int : t -> int

> val of_int : int <- t

> end

>

> And then you write concrete module Int that implements this signature. You

> can then write:

>

> module Row : Abs_int = Int

> module Col : Abs_int = Int

>

some of my old code to use it. It loses some opportunities for compiler

optimization, but otherwise seems perfect. No repetitive boilerplate to

write over and over, no unnecessary boxing, efficient conversion to the

base type (well, it still requires a function call to find out that

nothing needs to change, and maybe there's a shallow copy created in

this process), easy on the eyes/fingers syntax for conversion and

declaration and a readable type name for ocamlc to report when things go

wrong.

Thanks for sharing this gem. Any chance it can be added to a/the FAQ?

Nov 14, 2007, 11:20:53 AM11/14/07

to Edgar Friendly, caml-list, Yaron Minsky

Edgar Friendly wrote:

>Yaron Minsky wrote:

>

>

>>module type Abs_int : sig

>>type t

>>val to_int : t -> int

>>val of_int : int <- t

>>end

>>

>>And then you write concrete module Int that implements this signature. You

>>can then write:

>>

>>module Row : Abs_int = Int

>>module Col : Abs_int = Int

>>

>>

>>

>This approximates my idea of "optimal" well enough that I'll rewrite

>some of my old code to use it. It loses some opportunities for compiler

>optimization, but otherwise seems perfect. No repetitive boilerplate to

>write over and over, no unnecessary boxing, efficient conversion to the

>base type (well, it still requires a function call to find out that

>nothing needs to change, and maybe there's a shallow copy created in

>this process), easy on the eyes/fingers syntax for conversion and

>declaration and a readable type name for ocamlc to report when things go

>wrong.

>

>

Actually, Ocaml is pretty good at cross-module inlining, so that

normally the function call is optimized out, and the whole conversion

becomes a no-op. Or, at least, such has been my experience.

Brian

Nov 14, 2007, 11:39:28 AM11/14/07

to Pierre Weis, caml-list

Pierre Weis wrote:

> a private type is not an abstract type but a concrete type.

> a private type is not an abstract type but a concrete type.

I guess I got a wrong intuition of the new feature because of the "from"

function in your example. Is it the case that a value of type "abstract

int" can always be used as a value of type "int" automatically (you

mention that it is ok for pattern matching at least)?

-- Alain

Nov 14, 2007, 11:57:39 AM11/14/07

to Pierre Weis, caml-list

Pierre Weis wrote:

> In the next version of the compiler, you will have an extension to the

> definition of types with private construction functions (aka ``private''

> types) that just fulfills your programming concern: in addition to usual

> private sum and record private type definitions, you can now define a type

> abbreviation which is private to the implementation part of a module (see

> note 1).

>

At first glance, this improvement does satisfy my concerns, but I think> In the next version of the compiler, you will have an extension to the

> definition of types with private construction functions (aka ``private''

> types) that just fulfills your programming concern: in addition to usual

> private sum and record private type definitions, you can now define a type

> abbreviation which is private to the implementation part of a module (see

> note 1).

>

it still falls short of optimal in ways that seem easy to fix.

> Since the values of a private type are concrete, it is much easier for the

> programmer to use the values of the type abbreviation.

>

This concerns me - I'd like to require explicit casting/coercion to

create or use the internal value of the abstract type, maybe with an

exception for literals. Could you elaborate on this?

> This solution does not require any additional tagging or boxing, only the

> usage of injection/projection function for the type. As for usual private

> types, the injection function is handy to provide useful invariants (in the

> row type example case, we get ``a row value is always a positive integer'').

>

With a bit of low-level support, I imagine it not difficult to implement

the following:

type row = private int constraint (fun i -> i >= 0)

such that the compiler uses the provided constraint function to check

any (x :> row) casts, throwing an exception(?) on false. This solution

wouldn't involve the module system just to have positive integer types,

and gets rid of the function call overhead on 'from'.

> In addition, the private abbreviation is publicly exposed in the interface

> file. Hence, the compiler knows about it: it can (and effectively does)

> optimize the programs that use values of type row in the same way as if the

> type row were defined as a regular public abbreviation.

>

Does this mean I can do:

let one (r:row) = r+1

let onea (r : row) = r = 1

let two (r:row) (i:int) = r+i

let three : row -> string = function 3 -> "three" | _ -> "not three"

> Last but not least, being an instance of the identity function, the from

> projection function is somewhat generic: we can dream to suppress the burden

> of having to write it for each private type definition, if we find a way to

> have it automatically associated to the new private type by the compiler.

>

The use case for this feature is restricted construction to enforce

invariants, no? Anything more complex should probably be totally

abstract and not simply private. In which case, the constraint keyword

seems an effective way to tell the compiler what to do, and just let :>

coercions do their magic.

E.

Nov 14, 2007, 1:44:23 PM11/14/07

to Alain Frisch, caml-list, Pierre Weis

> Pierre Weis wrote:

> >a private type is not an abstract type but a concrete type.

>

> I guess I got a wrong intuition of the new feature because of the "from"

> function in your example. Is it the case that a value of type "abstract

> int" can always be used as a value of type "int" automatically (you

> mention that it is ok for pattern matching at least)?

> >a private type is not an abstract type but a concrete type.

>

> I guess I got a wrong intuition of the new feature because of the "from"

> function in your example. Is it the case that a value of type "abstract

> int" can always be used as a value of type "int" automatically (you

> mention that it is ok for pattern matching at least)?

If we stick to the row example of positive values, we get:

- a value of type row is in fact a concrete integer (it is not hidden in any

way),

- a value of type row can only be created by the make function defined in the

implementation of the module that defines the private type,

- a value of type row can be projected out of type row to a value of type int

with a ``no-op'' identity function (I called it from in the example).

So, no: a value of type row is not of type int and you need a marker to

indicate the projection (for the time being the marker is a (identity)

function call to let the implemention as simple as possible, but a sub-typing

constraint makes sense and we can provide it if this is considered clearer).

Best regards,

-- Pierre

Nov 14, 2007, 2:19:47 PM11/14/07

to Pierre Weis, caml-list, Alain Frisch

Pierre Weis wrote:

> If we stick to the row example of positive values, we get:

>

> - a value of type row is in fact a concrete integer (it is not hidden in any

> way),

> - a value of type row can only be created by the make function defined in the

> implementation of the module that defines the private type,

> - a value of type row can be projected out of type row to a value of type int

> with a ``no-op'' identity function (I called it from in the example).

>

> So, no: a value of type row is not of type int and you need a marker to

> indicate the projection (for the time being the marker is a (identity)

> function call to let the implemention as simple as possible, but a sub-typing

> constraint makes sense and we can provide it if this is considered clearer).

>

> Best regards,

>

> -- Pierre

> If we stick to the row example of positive values, we get:

>

> - a value of type row is in fact a concrete integer (it is not hidden in any

> way),

> - a value of type row can only be created by the make function defined in the

> implementation of the module that defines the private type,

> - a value of type row can be projected out of type row to a value of type int

> with a ``no-op'' identity function (I called it from in the example).

>

> So, no: a value of type row is not of type int and you need a marker to

> indicate the projection (for the time being the marker is a (identity)

> function call to let the implemention as simple as possible, but a sub-typing

> constraint makes sense and we can provide it if this is considered clearer).

>

> Best regards,

>

> -- Pierre

Other than piggybacking on the private row types work by J. Garrigue and

probably simplicity of implementation, does there exist a reason for

involving the module system in this mechanism?

At the basic level, creation and projection will always be identity

functions (except that creation can throw exceptions), so it seems

reasonable to elide that repetitive code. For more complex examples (a

private type with more functions that work on the internal

representation of that type), since the representation is already

exposed, it seems appropriate to just let the client implement those

functions through the identity accessors. It'd probably be safer than

doing so in the Row module, because the compiler could enforce the

creation invariant in those functions instead of allowing them to bypass

the 'make' function.

Once all these functions are removed from the module all that's left is

the type definition and the creation constraint. Am I missing something?

E.

Nov 14, 2007, 4:04:36 PM11/14/07

to Edgar Friendly, caml-list, Pierre Weis

> Pierre Weis wrote:

> > In the next version of the compiler, you will have an extension to the

> > definition of types with private construction functions (aka ``private''

> > types) that just fulfills your programming concern: in addition to usual

> > private sum and record private type definitions, you can now define a type

> > abbreviation which is private to the implementation part of a module (see

> > note 1).

> >

> At first glance, this improvement does satisfy my concerns, but I think

> it still falls short of optimal in ways that seem easy to fix.

>

> > Since the values of a private type are concrete, it is much easier for the

> > programmer to use the values of the type abbreviation.

> >

> This concerns me - I'd like to require explicit casting/coercion to

> create or use the internal value of the abstract type, maybe with an

> exception for literals. Could you elaborate on this?

> > In the next version of the compiler, you will have an extension to the

> > definition of types with private construction functions (aka ``private''

> > types) that just fulfills your programming concern: in addition to usual

> > private sum and record private type definitions, you can now define a type

> > abbreviation which is private to the implementation part of a module (see

> > note 1).

> >

> At first glance, this improvement does satisfy my concerns, but I think

> it still falls short of optimal in ways that seem easy to fix.

>

> > Since the values of a private type are concrete, it is much easier for the

> > programmer to use the values of the type abbreviation.

> >

> This concerns me - I'd like to require explicit casting/coercion to

> create or use the internal value of the abstract type, maybe with an

> exception for literals. Could you elaborate on this?

Values of a private type abbreviation are concrete in the sense that they are

public and not hidden to inspection. However, a value of the private type

abbreviation has a type that is the one of the private type, not the one f

the abbreviated expression. To project such a value you need some syntactic

construct: either a (identity) function call or a sub-typing constraint as

already proposed on this list.

To create a value of the private type abbreviation, you must use the

construction function (as usual with private types, you cannot freely crete

values, you must use the provided injection).

> > This solution does not require any additional tagging or boxing, only the

> > usage of injection/projection function for the type. As for usual private

> > types, the injection function is handy to provide useful invariants (in the

> > row type example case, we get ``a row value is always a positive integer'').

> >

> With a bit of low-level support, I imagine it not difficult to implement

> the following:

>

> type row = private int constraint (fun i -> i >= 0)

>

> such that the compiler uses the provided constraint function to check

> any (x :> row) casts, throwing an exception(?) on false. This solution

> wouldn't involve the module system just to have positive integer types,

> and gets rid of the function call overhead on 'from'.

The function call overhead can be avoided easily, if the from function is

provided by the compiler or if we use a sub-typing constraint row :> int.

On the other hand, the construction you proposed also applies to abstract

types: we need module to define an abstract type; we can have something

lighter such as a direct abstract type definition:

type rat = abstract {

numerator : int;

denominator : int;

} with

let make_rat n d = ...

and numerator {numerator = n} = n

and denominator ...

let plus_rat r1 r2 = ...

let mult_rat r1 r2 = ...

...

> > In addition, the private abbreviation is publicly exposed in the interface

> > file. Hence, the compiler knows about it: it can (and effectively does)

> > optimize the programs that use values of type row in the same way as if the

> > type row were defined as a regular public abbreviation.

> >

> Does this mean I can do:

>

> let one (r:row) = r+1

> let onea (r : row) = r = 1

> let two (r:row) (i:int) = r+i

> let three : row -> string = function 3 -> "three" | _ -> "not three"

No, you must explicitely project row values to int values (even if this is an

identity):

let one r = from r + 1

let onea r = from r = 1

let two (r:row) (i:int) = from r + i

let three : row -> string =

fun r -> match from r with

| 3 -> "three" | _ -> "not three"

> > Last but not least, being an instance of the identity function, the from

> > projection function is somewhat generic: we can dream to suppress the burden

> > of having to write it for each private type definition, if we find a way to

> > have it automatically associated to the new private type by the compiler.

> >

> The use case for this feature is restricted construction to enforce

> invariants, no?

Yes.

> Anything more complex should probably be totally

> abstract and not simply private. In which case, the constraint keyword

> seems an effective way to tell the compiler what to do, and just let :>

> coercions do their magic.

Given that the injection function (the make function or the constraint part

of your construction) must enforce arbitrarily complex invariants, we may

need a module for private abbreviation as well (imagine for instance a

private abbreviation for prime numbers, that only injects into the prime type

integer arguments that are indeed prime numbers: you may need some room to

define the predicate!).

Thank you for your suggestions.

Best regards,

--

Pierre Weis

INRIA Rocquencourt, http://bat8.inria.fr/~weis/

_______________________________________________

Nov 14, 2007, 5:10:10 PM11/14/07

to Pierre Weis, caml-list

Pierre Weis wrote:

> Values of a private type abbreviation are concrete in the sense that they are

> public and not hidden to inspection.

>

<SNIP>> Values of a private type abbreviation are concrete in the sense that they are

> public and not hidden to inspection.

>

> No, you must explicitely project row values to int values (even if

> this is an identity):

I guess I don't understand what you mean by "public and not hidden to

inspection". To a programmer using such a type, the values don't seem

public. Do you only mean that to the compiler sees the full type (for

optimization purposes), or something more?

> The function call overhead can be avoided easily, if the from function is

> provided by the compiler or if we use a sub-typing constraint row :> int.

>

I'd love to use subtyping constraints more than functions. Applying a

function *could* do some other work or massage the value it gives to me,

whereas a compiler cast seems to indicate what's happening better. It

also doesn't depend on any cross-module inlining magic that may or may

not happen.

> On the other hand, the construction you proposed also applies to abstract

> types: we need module to define an abstract type; we can have something

> lighter such as a direct abstract type definition:

>

> type rat = abstract {

> numerator : int;

> denominator : int;

> } with

>

> let make_rat n d = ...

> and numerator {numerator = n} = n

> and denominator ...

>

> let plus_rat r1 r2 = ...

> let mult_rat r1 r2 = ...

>

> ...

>

This example seems like a great time to use a module and an abstract

type: there's lots of functions that deal with the data in a way that

they all need to use its internal representation. But there's a use for

private copies of builtin types, possibly with restrictions on their

construction, and it seems that *this*

> Given that the injection function (the make function or the constraint part

> of your construction) must enforce arbitrarily complex invariants, we may

> need a module for private abbreviation as well (imagine for instance a

> private abbreviation for prime numbers, that only injects into the prime type

> integer arguments that are indeed prime numbers: you may need some room to

> define the predicate!).

>

let is_prime i = ... (* return true if prime, false if not *)

type prime = private int constraint is_prime

This seems to suffice for the example of primes. For natural numbers, I

don't see any advantage of having a private type vs. an abstract type,

so using a module to encapsulate that type makes more sense to me.

I might point out that supporting arbitrarily complex invariants, while

theoretically satisfying, might not be necessary. I'd imagine that 90+%

of actual uses of this pattern fall in the case of "simple range

restriction", and taking care of those well changes the approach

significantly. I like the Perl idea of "make the common things easy,

and the hard things possible".

> Thank you for your suggestions.

>

> Best regards,

>

Thank *you* for treating my naive and amateur suggestions as if they had

worth.

All the Best,

Eric

Nov 14, 2007, 7:18:15 PM11/14/07

to thele...@gmail.com, caml...@yquem.inria.fr

From: Edgar Friendly <thele...@gmail.com>

> With a bit of low-level support, I imagine it not difficult to implement

> the following:

>

> type row = private int constraint (fun i -> i >= 0)

>

> such that the compiler uses the provided constraint function to check

> any (x :> row) casts, throwing an exception(?) on false. This solution

> wouldn't involve the module system just to have positive integer types,

> and gets rid of the function call overhead on 'from'.

This syntax could be nice, but it is just syntactic sugar for

module Private_row : sig

type row = private int

val new_row : int -> row

end = struct

type row = int

let new_row i = assert (i >= 0); i

end

include Private_row

I'm sure camlp4 can do that.

Direct compiler support couldn't give you more:

* you cannot use a coercion to create a row: coercions are purely

type-level features, and cannot execute any code; we don't want to

change this. On the other hand coercing row to int could be made ok.

* the "constraint ..." part cannot appear in an interface, since

interfaces cannot contain expressions

Changing any of these two would be difficult indeed.

(To be honest, the above results in

module Private_row : sig type row = private int val new_row : int -> row end

type row = Private_row.row

val new_row : int -> row = <fun>

meaning that Private_row is not completely hidden, eventhough we don't

need to mention it in an interface.)

Jacques Garrigue

Nov 15, 2007, 1:24:02 AM11/15/07

to Jacques Garrigue, caml...@yquem.inria.fr

Jacques Garrigue wrote:

> This syntax could be nice, but it is just syntactic sugar for

>

> module Private_row : sig

> type row = private int

> val new_row : int -> row

> end = struct

> type row = int

> let new_row i = assert (i >= 0); i

> end

> include Private_row

>

Funny to define a module (hiding) just to include it (un-hiding). This> This syntax could be nice, but it is just syntactic sugar for

>

> module Private_row : sig

> type row = private int

> val new_row : int -> row

> end = struct

> type row = int

> let new_row i = assert (i >= 0); i

> end

> include Private_row

>

pair usually comes to a no-op, but here it doesn't.

> I'm sure camlp4 can do that.

>

It can. But camlp4 holds back real development of the OCaml Community

by dividing the language into incompatible splinter dialects or forcing

us to code in pure OCaml which has little sugar.

> Direct compiler support couldn't give you more:

>

> * you cannot use a coercion to create a row: coercions are purely

> type-level features, and cannot execute any code; we don't want to

> change this. On the other hand coercing row to int could be made ok.

>

Good point. And I guess there's no better way to pack the creation

assertion with the type than a module with a constructor function...

> * the "constraint ..." part cannot appear in an interface, since

> interfaces cannot contain expressions

>

Having interfaces contain arbitrary expressions... difficult. But I

guess that we would have a 90% solution by reducing the problem from

arbitrary constraints to simple range constraints. As long as it worked

for ints, floats and chars, .... Still not trivial, though.

Kila la heri,

E.

Nov 15, 2007, 1:33:19 AM11/15/07

to Pierre Weis, caml-list

Pierre Weis wrote:

> - a value of type row is in fact a concrete integer (it is not hidden in any

> way),

> - a value of type row is in fact a concrete integer (it is not hidden in any

> way),

But you cannot apply integer operations to it, so it is not a normal

concrete integer, right?

> - a value of type row can only be created by the make function defined in the

> implementation of the module that defines the private type,

> - a value of type row can be projected out of type row to a value of type int

> with a ``no-op'' identity function (I called it from in the example).

>

> So, no: a value of type row is not of type int and you need a marker to

> indicate the projection (for the time being the marker is a (identity)

> function call to let the implemention as simple as possible, but a sub-typing

> constraint makes sense and we can provide it if this is considered clearer).

Now I'm lost :-)

Can you show an example where replacing all "type t = private ..."

definitions by "type t" changes a well-typed program into an ill-typed

one? I understand that if private types create real subtypes (w.r.t.

:>) then they are different than abstract types, but otherwise, I don't

see the difference for the type-checker.

-- Alain

Nov 15, 2007, 5:55:50 AM11/15/07

to Edgar Friendly, Jacques Garrigue, caml...@yquem.inria.fr

On Thu, Nov 15, 2007 at 12:23:25AM -0600, Edgar Friendly wrote:

> > I'm sure camlp4 can do that.

> >

> It can. But camlp4 holds back real development of the OCaml Community

> by dividing the language into incompatible splinter dialects or forcing

> us to code in pure OCaml which has little sugar.

> > I'm sure camlp4 can do that.

> >

> It can. But camlp4 holds back real development of the OCaml Community

> by dividing the language into incompatible splinter dialects or forcing

> us to code in pure OCaml which has little sugar.

I'm glad that some people feels the same way about camlp4 !

this is really sad, that people thinks camlp4 is a great idea ...

--

Vincent Hanquez

Nov 15, 2007, 8:27:21 AM11/15/07

to Alain Frisch, caml-list, Pierre Weis

> >- a value of type row is in fact a concrete integer (it is not hidden in

> >any way),

>

> But you cannot apply integer operations to it, so it is not a normal

> concrete integer, right?

> >any way),

>

> But you cannot apply integer operations to it, so it is not a normal

> concrete integer, right?

Right: a value of type row has type row ... not type int!

> Can you show an example where replacing all "type t = private ..."

> definitions by "type t" changes a well-typed program into an ill-typed

> one? I understand that if private types create real subtypes (w.r.t.

> :>) then they are different than abstract types, but otherwise, I don't

> see the difference for the type-checker.

May be, I must recall what are private types in the first place: private

types has been designed to implement quotient data structure.

(*** Warning.

Wao: after re-reading this message I realize that it is really long.

You can skip it, if you already know something about mathematical quotient

structures, private types for record and variant, relational types and the

mocac compiler!

***)

What is a quotient ?

--------------------

Here quotient has to be understood in the mathematical sense of the term:

given a set S and an equivalence relation R, you consider the set S/R of the

equivalence classes modulo the relation R. S/R is named the quotient

structure of S by R. Quotient structures are fundamental in mathematics and

there properties have been largely studied, in particular to find the

relationship between operations defined on S and operations on S/R: which

operations on S can be extended to operations on S/R ? Which properties of

operations on S are still valid for there extension on S/R ? and so on.

Simple examples of quotient structures can be found everywhere in maths, for

instance consider the equivalence relation R on pairs of integer values

defined as

z_1 equiv z_2 if and only if z_1 and z_2 are both odd or are both even

(in Caml terms

z_1 equiv z_2 if and only if (z_1 mod 2) = (z_2 mod 2))

The set Z/R is named Z/2Z and it inherits properties of operations on Z: it

is an abelian group (and more).

A wonderful example of such inheritance of interesting properties by

inheritance of operations thanks to a definition by a quotient structure is

the hierarchy of sets of numbers: starting with N (the set of natural

numbers) we define Z (the set of relative integer numbers) as a quotient of

NxN), then Q (the rational numbers) as a quotient of ZxZ*, R (the set of real

numbers) as a quotient of Q^N (the sets of sequences of rational numbers), C

(the set of complex numbers) as a quotient of R[X] (the set of polynomials

with one unknown and real coefficients). Note here that at each step of the

hierarchy the quotient structure is richer (has more properties and/or more

elements) than the original structure: first we have a monoide, then a group

and a ring, then a field, then a complete field and so on.

Why quotient structures ?

-------------------------

So quotient structures are a fundamental tool to express mathematical

structures and properties. Exactly as mathematical functions, relations and

sets. As you may have noticed, programming languages are extremely related to

maths (due to their purely logical bases) even if, in some cases, the zealots

of the language at hand try to hide the mathematical fundation into a strange

anthropomorphic discurse to describe their favorite language features.

In the end, the computer programing languages try hard to incorporate

powerful features from mathematics, because these features have been polished

by mathematicians for centuries. As a consequence of this work, we know

facts, properties and theorems about the mathematical features and this is an

extremely valuable benefit.

Now, what is the next frontier ? What can we steal more to mathematics for the

benefit of our favorite programming language ?

If we review the most powerful tools of mathematicians, we found that

functions have been borrowed (hello functional programming, lambda-calculus

and friends :)), relations have been borrowed (data bases, hash tables,

association lists), sets have been more or less borrowed (hello setle,

pascal, and set definition facilities from various libraries of various

languages...). More or less, we have all those basic features.

>From the mathematical set construction tools, we have got in Caml:

- the cartesian product of sets (the * binary type constructor, the record

type definitions),

- the disjunct union of sets (the | of polymorphic variants, the sum (or

variant) type definitions).

Unfortunately, we have no powerful way to define a quotient data

structure. That what private type definitions have been designed to do.

What do we need for a quotient data structure ?

-----------------------------------------------

In the first place, we need the ``true'' thing, the real feature that is

indeed used in maths. Roughly speaking this means to assimilate the quotient

set S/R to a subset of S.

In the previous definition of quotient structures, there is a careful

distinction between the base set S and the quotient set S/R. In fact, there

always exists a canonical injection from S to S/R, and if we choose a

canonical representant in each equivalence class of S/R, we get another

canonical injection from S/R to S, so that S/R can be considered as a subset

of S (the story is a bit more complex but that's the idea). This

injection/projection trickery is intensively used in maths; for instance, in

the hierarchy of number sets, we say and consider that N is a subset of Z

that itself is a subset of Q, a subset of R, a subset of C. Rigourously, we

should say for instance, there is a subset of Z that is canonically

isomorphic to N; in fact, we abusively assimilate N to this subset of Z;

hence, we say that N is ``included'' in Z, when we should say ``the image of

N by the canonical isomorphism from N to Z'' is included in Z; then, we go

one step further in the assimilation of N to a subset of Z, by denoting the

same, the elements of N and there image in Z; for instance, ``the neutral

element of the multiplication in Z'' and the successor of 0 in N is denoted

``1''; and we now can say that 1 belongs to Z. Note here that, in the first

place, ``the neutral element of the multiplication in Z'' is an equivalence

class (as all elements in Z are). So it is a set. More exactly, the ``neutral

element of the multiplication in Z'' is the infinite set of pairs of natural

numbers (n, m) such that n - m = 1 (here ``-'' is an operation in N and ``1''

is the successor of the natural number ``0'', so that n - m = 1 is a

shorthand for n = succ m). The assimilation between N and its isomorphic

image allows to use 1 as the denotation of this infinite set of pairs of

natural numbers.

We understand why the mathematicians always write after having designed a

quotient structure: ``thanks to this isomorphism, and if no confusion may

arise, we always assimilate S to its canonical injection in S/R''.

This assimilation is what private type definitions allow.

How do we define a quotient data structure ?

--------------------------------------------

The mathematical problem:

- we have a set S and an equivalence relation R on SxS,

- we construct the quotient S/R.

- we state afterwards:

``if no confusion may arise, we always assimilate S to its canonical

injection in S/R''.

The programming problem:

- we have a data structure S (defined by a type s) and a relation R on SxS

(defined by a function r from s * s to bool).

- we construct a data structure that represents S/R.

- we have afterwards:

``No confusion may arise, we always assimilate S to its canonical injection

in S/R''.

The private data type solution:

- the data structure S is defined as any Caml data structure and the

relation R is implemented by the canonical injection(s) from S to S/R.

- the canonical projection from S/R to S is automatic.

- S (defined by s) is assimilated to S/R (which is then also s!).

We defined S/R as a subset of S, the set of canonical representants of

equivalence classes of S/R.

More exactly, the canonical injection from S to S/R maps each element of S to

its equivalent class in S/R; if we assimilate each equivalence class to its

canonical representant (an element of S), the canonical injection maps each

element of S to the canonical representant of its equivalence class. Hence

the canonical injection has type S -> S.

An example treated without private types:

-----------------------------------------

Let's take a simple example:

S is the following data structure that implements a mathematical (free) structure

generated by the constant 0, the unary symbol succ, and the binary symbol +.

type peano =

| Zero

| Succ of peano

| Plus of peano * peano

R is the (equivalence) relation that expresses that

- 0 is the neutral element for + (for all x in S, 0 + x = x and x + 0 = x),

- + is associative (for all x, y, z in Sł, (x + (y + z)) = ((x + y) + z)).

The canonical injection is a function from peano -> peano that maps each value

in S (the type peano) to the canonical representant of its class in S/R (an

element of S (the type peano)):

let rec make = function

| Zero -> Zero

| Plus (Succ n, p) -> Succ (make (Plus (n, p)))

| Plus (Zero, p) -> p

| Plus (p, Zero) -> p

| Plus (Plus (x, y), z) -> make (Plus (x, make (Plus (y, z))))

| Succ p -> Succ p

;;

(This function may be wrong but you see the idea :))

So, if you want to represent S/R for peano in Caml you must:

- (1) define the type peano

- (2) always use the make function to create a value in S/R

- (3) not to confuse S and S/R in your head (I mean in your programs)

Private data types permits (1), ensures (2), and helps for (3) concerning the

head part and ensures (3) for programs by means of compile-time type errors.

The same example with private types:

------------------------------------

To define a private data type you must define a module.

- in the implementation, you define the carrier S and its canonical injection.

- in the interface, you export the carrier and the injection.

The type checker will ensure transparent projection from the quotient to the

carrier and mandatory use of the canonical projection to build a value in

S/R.

interface peano.mli

-------------------

type peano = private

| Zero

| Succ of peano

| Plus of peano * peano

;;

val zero : peano;;

val succ : peano -> peano;;

val plus : peano * peano -> peano;;

implementation peano.ml

-----------------------

type peano =

| Zero

| Succ of peano

| Plus of peano * peano

;;

let rec make = function

...

;;

let zero = make Zero;;

let succ p = make (Succ p);;

let plus (n, m) = make (Plus (n, m));;

(See note (1) for a discussion on the design of this example.)

What is given by private types:

-------------------------------

- You cannot create a value of S/R if you do not use the canonical injection

# Zero;;

Cannot create values of the private type peano

- As a consequence, values in S (i.e. S/R) are always canonical:

# let one = succ zero;;

val one : M.peano = Succ Zero

# let three = plus (one, succ (plus (one, zero)));;

val three : M.peano = Succ (Succ (Succ Zero))

- Projection is automatic

# let rec int_of_peano = function

| Zero -> 0

| Succ n -> 1 + int_of_peano n

| Plus (n, p) -> int_of_peano n + int_of_peano p

;;

val int_of_peano : M.peano -> int = <fun>

# int_of_peano three;;

- : int = 3

What about private abbreviations ?

----------------------------------

Private abbreviation definitions are a natural extension of private data type

definitions to abbreviation type definitions. The same notions are carried

out mutatis-mutandis:

- we have a data structure S (defined by a type s) and a relation R on SxS

(defined by a function r from s * s to bool).

- we construct a data structure that represents S/R.

- we have afterwards:

``No confusion may arise, we always assimilate S to its canonical injection

in S/R''.

In the case of abbreviations:

- the data structure S/R is defined by a type s (which is an abbreviation) and

a relation defined by a function,

- the canonical injection should be defined in the implementation file of the

private data type module,

- we always assimilate S to its canonical injection in S/R.

In pratice, as for usual private types (for which the constructive part of

the data type is not available outside the implementation), the type abbreviation

is not known outside the implementation (it is really private to its

implementation). This prevents the construction of values of S/R without

using the canonical injection.

Th noticeable difference is the projection function: it cannot be fully

implicit (otherwise, as you said Alain, the assimilation will turn to complete

confusion: we would have S identical to S/R, hence row=int and no difference

between a regular abbreviation definition and a private abbreviation

definition). If not implicit, the injection function should granted to be the

identity function (something that we would get for free, if we allow

projection via sub-typing coercion).

In short: abstract data types provide values that cannot be inspected nor

safely manipulated without using the functions provided by the module that

defines the abstract data type. In contrast, private data types are

explicitely concrete and you can freely write any algorithm you need. A good

test is printing: you can always write a function to print values of a

private type, it is up to the implementor of an abstract type to give you the

necessary primitives to access the components of the abstracted values.

Automatic generation of the canonical injection:

------------------------------------------------

You may have realized that writing the canonical injection can be a real

challenge.

The moca compiler (see http://moca.inria.fr/) helps you to write the

canonical injection by generating one for you, provided you can express the

injection at hand via a set of predefined algebraic relations (and/or rewrite

rules you specify) attached to the constructors of the private type. Private

types with constructors having algebraic relations are named relational

types. Moca generated canonical injections for relation types.

For instance, you would write the peano example as the following peano.mlm

file:

type peano = private

| Zero

| Succ of peano

| Plus of peano * peano

begin

associative

neutral (Zero)

rule Plus (Succ n, p) -> Succ (Plus (n, p))

end;;

The mocac compiler will generate the interface and implementation of the

peano module for you, including the necessary canonical injection function.

Best regards,

--

Pierre Weis

INRIA Rocquencourt, http://bat8.inria.fr/~weis/

Note (1):

- we can't just export the canonical injection since you could not build any

value of the type without previously defined values!

- we provide specialized versions of the canonical injection function

introducing the convention that the lowercase name of a value constructor is

the name of its associated canonical injection.

- we do not export the plasin true canonical injection since it would be more

confusing than useful.

Nov 15, 2007, 8:49:25 AM11/15/07

to Vincent Hanquez, caml...@yquem.inria.fr

Vincent Hanquez wrote:

> On Thu, Nov 15, 2007 at 12:23:25AM -0600, Edgar Friendly wrote:

>

>> But camlp4 holds back real development of the OCaml Community

>> by dividing the language into incompatible splinter dialects or forcing

>> us to code in pure OCaml which has little sugar.

>>

>

> I'm glad that some people feels the same way about camlp4 !

> this is really sad, that people thinks camlp4 is a great idea ...

>

> On Thu, Nov 15, 2007 at 12:23:25AM -0600, Edgar Friendly wrote:

>

>> But camlp4 holds back real development of the OCaml Community

>> by dividing the language into incompatible splinter dialects or forcing

>> us to code in pure OCaml which has little sugar.

>>

>

> I'm glad that some people feels the same way about camlp4 !

> this is really sad, that people thinks camlp4 is a great idea ...

>

Come now, be fair, camlp4 has both advantages and disadvantages.

Advantages:

- Allows you to *prove* that some syntactic sugar can make a big

difference in writing code. This is 2 orders of magnitude more

effective in convincing people than just arguing for it (on caml-list,

in person, whatever). Some extensions become popular because they have

proved themselves this way. What is lacking is a process by which these

can be integrated into the language.

- Allows you to show that some boilerplate code can be eliminated. This

is even better than the above, because every such instance is really a

research problem in disguise: how does one design a type system that can

show that this purely syntactic manipulation is correct?

- Allows you to embed a DSL into OCaml with quite a bit of ease and

freedom. One of the reasons to use campl4 instead of OCaml directly is

that for embedding DSLs, it is very frequent that lazyness is needed,

something less comfortable in OCaml.

Disadvantages:

- It splinters the language into dialects.

- It makes errors in the source code of user programs difficult to track

[anyone who has ever debugged camlp4-enabled programs have seen this]

- Bugs in the extension itself are fantastically difficult to debug.

Personally, I think the best situation would be if camlp4 were

unnecessary. But there is a lot of PL research to be done before that's

possible. And in the meantime, the extensions that people choose to

write in camlp4 communicate very loudly what users of OCaml really want

[as they are willing to put in serious effort into ``adapting'' the

language to their needs]; when I used to be a language developer (on

another system and, sadly, before I got proper training to do so), I

weighted user requests by how much effort the user had put into the

request. This annoyed loudmouths and pleased hardcore users, which

seemed fine with me.

Jacques

Nov 15, 2007, 9:52:48 AM11/15/07

to caml...@yquem.inria.fr

On Thursday 15 November 2007 13:48, Jacques Carette wrote:

> Personally, I think the best situation would be if camlp4 were

> unnecessary. But there is a lot of PL research to be done before that's

> possible...> Personally, I think the best situation would be if camlp4 were

> unnecessary. But there is a lot of PL research to be done before that's

How much more PL research do we need to tell us that OCaml is crying out for

a "try .. finally" construct?

> - It splinters the language into dialects.

Fork the OCaml distribution instead of using camlp4.

There are now thousands of OCaml programmers dying for trivial additions like

this, many of whom would contribute if they could and a single forked dialect

would improve in practical terms much faster than the current OCaml is.

--

Dr Jon D Harrop, Flying Frog Consultancy Ltd.

http://www.ffconsultancy.com/products/?e

_______________________________________________

Nov 15, 2007, 11:55:38 AM11/15/07

to Jon Harrop, caml...@yquem.inria.fr

On Thu, 15 Nov 2007, Jon Harrop wrote:

> On Thursday 15 November 2007 13:48, Jacques Carette wrote:

>> Personally, I think the best situation would be if camlp4 were

>> unnecessary. But there is a lot of PL research to be done before that's

>> possible...

>

> How much more PL research do we need to tell us that OCaml is crying out for

> a "try .. finally" construct?

>

>> - It splinters the language into dialects.

>

> Fork the OCaml distribution instead of using camlp4.

>

> There are now thousands of OCaml programmers dying for trivial additions like

> this,

As far as I know, nobody has died yet.

> many of whom would contribute if they could and a single forked dialect

> would improve in practical terms much faster than the current OCaml is.

That's only a theory. You sound like OCaml is the worst language on the

planet, while many people here think that it's the best.

Adding gazillions of extensions to the core language could just kill it.

I'm sure you know that.

Back to work...

Martin

_______________________________________________

Nov 15, 2007, 12:30:51 PM11/15/07

to Pierre Weis, caml-list, Alain Frisch

Okay, let's see if I can summarize:

Private types have use because you can expose your implementation while

still having control over construction of values. This is important for

implementing quotient structures.

After reading everything about quotient types and the need for private

types, I have to ask "why not just completely abstract the type"? What

you seem to want from private types, you seem to gain pretty easily

through abstract types.

E.

Nov 15, 2007, 3:29:29 PM11/15/07

to Edgar Friendly, caml-list, Pierre Weis, Alain Frisch

The main problem I have with abstract types is that they are heavyweight

since they need to be defined inside modules. In that particular, the

proposed private types are also heavyweight.

I would like to have private types be some kind of lightweight abstract

types, as illustrated in the following example with two different

injections and two different projections between integers and even

numbers. Note that no modules are needed. The only way to get into or out

of a private type would be by explicit coercion.

Private types defined this way would be sound, as expressions like

"half 10" would be forbidden by the compiler. The right expression

would be "half (10 :> even)". A programmer may break the semantics of

"even" by writing things such as "(11 :> even)", but he better know

what he is doing. The current abstract types can be used to make a private

type abstract too if that safety is needed.

type even = private int

let half (x:even) = (x :> int)/2

val half : even -> int

let int_of_even (x:even) = (x :> int)

val int_of_even : even -> int

let dbl (x:int) = (2*x :> even)

val dbl : int -> even

let even_of_int (x:int) =

if x mod 2 = 0 then (x :> even)

else invalid_arg "even_of_int"

val even_of_int : int -> even

There is another less natural way to encode even numbers that enforces

automatically the semantics: even number 2*n is encoded as integer n.

Then, the example becomes:

type even = private int

let half (x:even) = (x :> int)

let int_of_even (x:even) = 2*(x :> int)

let dbl (x:int) = (x :> even)

let even_of_int (x:int) =

let e = x/2 in

if 2*e = x then (e :> even)

else invalid_arg "even_of_int"

However, this kind of semantics-preserving encoding is just a matter of good

luck, and I don't think it can be generalized to many cases, at least

statically. You could always have a run-time check, but that misses the

point. Although, I don't see a way to avoid the run-time check in the

function even_of_int...

In summary, in my opinion private types should be completely orthogonal

to abstract types, so that both can be used either together or

separately.

Thanks,

Fernando

Nov 15, 2007, 5:19:30 PM11/15/07

to Pierre Weis, caml-list, Alain Frisch

This long article is very interesting and pedagogic to the novice

programmer I am. The `int_of_peano' example is quite enlightening for

me. However, I must confess I was confused by the use of

`injection'/`projection', especially since the `canonical injection

S -> S/R' is usually not injective! The name `canonical surjection'

would suit it better, would'nt it? I am curious to know why this

strange terminology is used.

--

Regards,

Michaël LB

programmer I am. The `int_of_peano' example is quite enlightening for

me. However, I must confess I was confused by the use of

`injection'/`projection', especially since the `canonical injection

S -> S/R' is usually not injective! The name `canonical surjection'

would suit it better, would'nt it? I am curious to know why this

strange terminology is used.

--

Regards,

Michaël LB

Nov 15, 2007, 5:33:03 PM11/15/07

to Edgar Friendly, caml-list, Pierre Weis, Alain Frisch

Edgar Friendly <thele...@gmail.com> writes:

> After reading everything about quotient types and the need for private

> types, I have to ask "why not just completely abstract the type"? What

> you seem to want from private types, you seem to gain pretty easily

> through abstract types.

I think you have overlooked the following example in Pierre's article:

- Projection is automatic

# let rec int_of_peano = function

| Zero -> 0

| Succ n -> 1 + int_of_peano n

| Plus (n, p) -> int_of_peano n + int_of_peano p

;;

val int_of_peano : M.peano -> int = <fun>

# int_of_peano three;;

- : int = 3

Since we have access to the representation of the value we can filter

values (just like in the example) and since we are unable to create

ill-shaped values, we can assume that good invariants are true in the

values we are examining.

If we want to do the same with abstract data types, we would have

two types involved instead of one:

* the abstract type peano already described;

* a second type `good_representation' and an explicit application

`projection: peano -> good_representation'

Let's say that sets manipulated through the `Set' module are

represented by balanced trees. If the type set had been private

instead of abstract, we could investigate the representation of a

value of the type but could not produce an ill-formed value (an

`unbalenced tree').

(I also may be totally wrong :) )

--

Cheers,

Michaël

Nov 15, 2007, 7:30:59 PM11/15/07

to Pierre Weis, caml-list, Alain Frisch

Most of what you said about quotient types made sense to me, but I must

admit to being deeply confused about the nature of the change to the

language. Consider the following trivial module and two possible

interfaces.

admit to being deeply confused about the nature of the change to the

language. Consider the following trivial module and two possible

interfaces.

module Int = struct

type t = int

let of_int x = x

let to_int x = x

end

module type Abs_int = sig

type t

val of_int : int -> t

val to_int : t -> int

end

module type Priv_int = sig

type t = private int

val of_int : int -> t

val to_int : t -> int

end

So, what is the difference between (Int : Abs_int) and (Int : Priv_int)?

For instance, can I write (Priv_int.of_int 3) + (Priv_int.of_int 5)? Can

you point to anything concrete I can do with the private version that I

can't do with the abstract version? Is there any expression that is legal

for one but not for the other?

y

> INRIA Rocquencourt, http://bat8.inria.fr/~weis/<http://bat8.inria.fr/%7Eweis/>

Nov 15, 2007, 7:31:55 PM11/15/07

to Fernando Alegre, caml-list

On Thu, 15 Nov 2007, Fernando Alegre wrote:

>

> The main problem I have with abstract types is that they are heavyweight

> since they need to be defined inside modules. In that particular, the

> proposed private types are also heavyweight.

I don't see modules as being that heavyweight. Absent functors, they're

just namespaces. And Ocaml's cross-module inlining eliminates even most

of that overhead- identity conversions are generally inlined and become

no-ops. Even with functors, the overhead is small, approximately that of

calling a virtual function in C++ or Java.

Just my $0.02.

Brian

Nov 15, 2007, 7:47:02 PM11/15/07

to caml...@yquem.inria.fr

Hi,

On Thu, 15 Nov 2007 14:26:49 +0100, Pierre Weis wrote:

>

> we define Z (the set of relative integer numbers) as a quotient of NxN),

I believe Z is generally defined as a quotient of the disjoint union

(borrowed as variants in programming languages) of N and N

(identifying the two 0). Of course you can also define it as a

quotient of N×N by the smaller equivalent relation "~" s.t. (n,m) ~

(n+1,m+1) and it is quite nice as addition is "inherited" from the one

on N×N but I am not sure it is the way it is usually done.

> In the previous definition of quotient structures, there is a

> careful distinction between the base set S and the quotient set

> S/R. In fact, there always exists a canonical injection from S to S/R,

Sorry for the nit-picking (I'm a mathematician, you know how we

are :) ), but unless R is diagonal, the function S -> S/R is not

injective (since we identify R-equivalent elements) but it is

surjective.

> We understand why the mathematicians always write after having designed a

> quotient structure: ``thanks to this isomorphism, and if no confusion may

> arise, we always assimilate S to its canonical injection in S/R''.

By the above remark, S can be identified to a subset of S'/R for some

S' but definitely not (of) S/R.

> (...)

> - the canonical projection from S/R to S is automatic.

> (...)

> More exactly, the canonical injection from S to S/R maps each element of S to

> its equivalent class in S/R; if we assimilate each equivalence class to its

> canonical representant (an element of S), the canonical injection maps each

> element of S to the canonical representant of its equivalence class. Hence

> the canonical injection has type S -> S.

This is rather the projection (and generally not an injection) as,

mathematically, p : S -> S is a projection means p(p x) = p x, forall x.

Moreover, in the abstract, there is no canonical representative of an

equivalent class -- that depends on the situation at hand.

Given your example (deleted), I suppose what you mean is that the

projection S -> S (representing S -> S/R) has to be written (since one

has to actually choose a representative) while the injection S/R -> S

comes for free or up to a coerecion (since we identified S/R to a

subset of S).

> What about private abbreviations ? (...)

> If not implicit, the injection function should granted to be the

> identity function (something that we would get for free, if we allow

> projection via sub-typing coercion).

I second the use of a subtyping coercion for that (after all, it would

be really annoying to have to use a private type abbreviation from a

module "forgetting" to provide a way out :) ).

> The moca compiler (see http://moca.inria.fr/) helps you to write the

> canonical injection by generating one for you, provided you can express the

> injection at hand via a set of predefined algebraic relations (and/or rewrite

May you tell a bit more ? That looks quite interesting ! Since there

is nothing canonical about the injection S/R -> S, what does moca do

if several representatives are possible (e.g. pick one by preferring

left associativity to the right one?). (I saw there is some paper on

the page but I have no time to dig through it right now.)

Best regards,

ChriS

Nov 15, 2007, 8:52:11 PM11/15/07

to Yaron Minsky, caml-list, Pierre Weis, Alain Frisch

On Thu, 15 Nov 2007, Yaron Minsky wrote:

> Most of what you said about quotient types made sense to me, but I must

> admit to being deeply confused about the nature of the change to the

> language. Consider the following trivial module and two possible

> interfaces.

>

> module Int = struct

> type t = int

> let of_int x = x

> let to_int x = x

> end

>

> module type Abs_int = sig

> type t

> val of_int : int -> t

> val to_int : t -> int

> end

>

> module type Priv_int = sig

> type t = private int

> val of_int : int -> t

> val to_int : t -> int

> end

>

> So, what is the difference between (Int : Abs_int) and (Int : Priv_int)?

Just like for other constructors of concrete types, "private" makes them

read-only, i.e. you can use them only in pattern-matching.

You can write

match (x : Priv_int.t) with 0 -> true | _ -> false

But not

(x : Priv_int.t) = 0

> For instance, can I write (Priv_int.of_int 3) + (Priv_int.of_int 5)?

No, Priv_int would have to define its own Priv_int.(+) function.

> Can

> you point to anything concrete I can do with the private version that I

> can't do with the abstract version? Is there any expression that is legal

> for one but not for the other?

You can do some pattern-matching over private versions of ints, strings,

chars, arrays, builtin lists, polymorphic variants, in addition to records

and variants, without the risk of passing such values to the wrong

function (if such wrong function expects a real "int" for instance).

You can't do any useful pattern matching with abstract types.

Martin

_______________________________________________

Nov 16, 2007, 3:23:29 AM11/16/07

to Christophe TROESTLER, Caml

Christophe TROESTLER wrote:

> I believe Z is generally defined as a quotient of the disjoint union

> (borrowed as variants in programming languages) of N and N

> (identifying the two 0). Of course you can also define it as a

> quotient of N×N by the smaller equivalent relation "~" s.t. (n,m) ~

> (n+1,m+1) and it is quite nice as addition is "inherited" from the one

> on N×N but I am not sure it is the way it is usually done.

> I believe Z is generally defined as a quotient of the disjoint union

> (borrowed as variants in programming languages) of N and N

> (identifying the two 0). Of course you can also define it as a

> quotient of N×N by the smaller equivalent relation "~" s.t. (n,m) ~

> (n+1,m+1) and it is quite nice as addition is "inherited" from the one

> on N×N but I am not sure it is the way it is usually done.

As another mathematician, also prone to nitpicking, I beg to differ. It

is more natural to define Z as a quotient of N x N, because this is just

a special case of a general construction of a (commutative) ring out of

a (commutative) rig.

A rig is a structure with 0, + and * but no subtraction.

A ring is a structure with 0, +, - and *.

(Of course you need to imagine suitable axioms for 0, +, * such as

commutativity, associativity and distributivity, but I am not writing

them down here.)

From every rig M we can construct a ring R as a quotient (M x M)/~ of M

x M where we define

(a,b) ~ (c,d) iff a+d = b+c

Think of (a,b) as the "non-existent" difference a-b. This construction

is natural in the technical sense that it is left adjoint to the

forgetful functor(*) from rings to rigs. (In non-mathematician language:

if we already have + and *, there is a _best_ way of getting - as well.)

Caveat: (*) the use of "functor" here was as in mathematics, not as in

ocaml.

In what way is natural, or canonical, the construction of Z defined by

sticking together two copies of N?

To add some value to this post, other than mathematical willy waving,

let me pose a puzzle. Implement the above in Ocaml, i.e., start with:

module type RIG =

sig

type t

val zero : t

val add : t -> t -> t

val mul : t -> t -> t

end

module type RING =

sig

include RIG

val sub : t -> t -> t

end

The puzzle is this: write down the module type for the construction

taking a rig and returning the free ring over it. It is _not_ just

module type ConstructRing = functor (M : RIG) -> RING

because that would allow _any_ ring whatsoever as the result. It has to

be the ring that comes from M via above construction, so there has to be

extra stuff, such as an embedding of M into the resulting R. But what else?

Andrej

Nov 16, 2007, 3:59:03 AM11/16/07

to Andrej...@andrej.com, Christophe TROESTLER, Caml

Andrej Bauer a écrit :

> The puzzle is this: write down the module type for the construction

> taking a rig and returning the free ring over it. It is _not_ just

>

> module type ConstructRing = functor (M : RIG) -> RING

>

> because that would allow _any_ ring whatsoever as the result. It has to

> be the ring that comes from M via above construction, so there has to be

> extra stuff, such as an embedding of M into the resulting R. But what else?

> The puzzle is this: write down the module type for the construction

> taking a rig and returning the free ring over it. It is _not_ just

>

> module type ConstructRing = functor (M : RIG) -> RING

>

> because that would allow _any_ ring whatsoever as the result. It has to

> be the ring that comes from M via above construction, so there has to be

> extra stuff, such as an embedding of M into the resulting R. But what else?

Despite the embedding you're mentioning, which could be added as follows

module type ConstructRing =

functor (M : RIG) -> sig include RING val embed : M.t -> t end

I don't see what else you could add to the signature (unless you want to

expose that the type t of the RING is a pair of type M.t * M.t; and even

with that you couldn't expose the definitions of the operations in the

signature).

All the properties you're expecting for the resulting ring are part of

(logical) specifications, which cannot be expressed via ocaml types.

You could do that in a richer type system (e.g. Coq).

But may be I misunderstood your question...

--

Jean-Christophe

Nov 16, 2007, 4:13:35 AM11/16/07

to Jean-Christophe Filliâtre, Caml

Jean-Christophe Filliâtre wrote:

> Despite the embedding you're mentioning, which could be added as follows

>

> module type ConstructRing =

> functor (M : RIG) -> sig include RING val embed : M.t -> t end

>

> I don't see what else you could add to the signature (unless you want to

> expose that the type t of the RING is a pair of type M.t * M.t; and even

> with that you couldn't expose the definitions of the operations in the

> signature).

>

> All the properties you're expecting for the resulting ring are part of

> (logical) specifications, which cannot be expressed via ocaml types.

> You could do that in a richer type system (e.g. Coq).

> Despite the embedding you're mentioning, which could be added as follows

>

> module type ConstructRing =

> functor (M : RIG) -> sig include RING val embed : M.t -> t end

>

> I don't see what else you could add to the signature (unless you want to

> expose that the type t of the RING is a pair of type M.t * M.t; and even

> with that you couldn't expose the definitions of the operations in the

> signature).

>

> All the properties you're expecting for the resulting ring are part of

> (logical) specifications, which cannot be expressed via ocaml types.

> You could do that in a richer type system (e.g. Coq).

Not quite. The universal property of the resulting ring R is this:

for any ring T, if f : M -> T is a function preserving 0, + and * then

there is a unique function g : R -> T preserving 0, +, * and - such that

g (include x) = f x, where include is as above. This needs to be

accounted for, and it's not just a logical specification.

Andrej

Nov 16, 2007, 4:24:28 AM11/16/07

to caml-list

Martin Jambon wrote:

> You can write

> match (x : Priv_int.t) with 0 -> true | _ -> false

> You can write

> match (x : Priv_int.t) with 0 -> true | _ -> false

Actually, you cannot do that, at least with private types as implemented

in OCaml's CVS. And this is to be expected given the lack of implicit

subsumption in OCaml. If you were able to do such a thing, what type

schema would you give to:

let f = function 0 -> true | _ -> false

?

This function should work on integers as well as on values of type

Priv_int.t, but the type algebra cannot express that.

My understanding (Pierre will correct me if I'm wrong) is that private

type abbreviations, as they are currently implemented, can always be

replaced by abstract types without turning a well-typed program into an

ill-typed one. Doing so will prevent some type-based optimizations to

happen, though.

But the really interesting thing is that the new feature opens the door

to extending the subtyping operator :> to take into account the natural

(identity) injection from a private abbreviation to the underlying type.

This is especially useful when the value of the private type appears

deeply nested in a structure. With a normal function that implements the

injection, you need to lift it to the whole structure which forces

useless copies (and worse: the manual lifting may not be possible if

some module declares a covariant type without a corresponding map

function).

For instance:

(l : (Priv_int.t * Priv_int.t) list :> (int * int) list)

instead of

List.map (fun (x, y) -> (Priv_int.to_int x, Priv_int.to_int y)) l

-- Alain

Nov 16, 2007, 4:49:35 AM11/16/07

to OCaml Mailing List, Andrej...@fmf.uni-lj.si, Andrej...@andrej.com

On Fri, 16 Nov 2007 09:23:28 +0100, Andrej Bauer wrote:

>

> This construction is natural in the technical sense that it is left

> adjoint to the forgetful functor(*) from rings to rigs.

>

> This construction is natural in the technical sense that it is left

> adjoint to the forgetful functor(*) from rings to rigs.

OK.

> In what way is natural, or canonical, the construction of Z defined

> by sticking together two copies of N?

I do not think there is something natural about this construction in

the way you outlined it above. It's just ad hoc for this case.

> The universal property of the resulting ring R is this: for any ring

> T, if f : M -> T is a function preserving 0, + and * then there is a

> unique function g : R -> T preserving 0, +, * and - such that g

> (include x) = f x, where include is as above. This needs to be

> accounted for, and it's not just a logical specification.

Would this fit your bill?

module type To_Ring = functor (M: RIG) -> sig

include RING

val embed : M.t -> t

module F : functor (T:RING) -> sig

val lift : (M.t -> T.t) -> (t -> T.t)

end

end

Regards,

ChriS

Nov 16, 2007, 9:17:38 AM11/16/07

to caml-list

"Alain Frisch" <al...@frisch.fr>:

>

> But the really interesting thing is that the new feature opens the door

> to extending the subtyping operator :> to take into account the natural

> (identity) injection from a private abbreviation to the underlying type.

>

> But the really interesting thing is that the new feature opens the door

> to extending the subtyping operator :> to take into account the natural

> (identity) injection from a private abbreviation to the underlying type.

Indeed. The first thing I was wondering when reading Pierre's description

was how that makes

type t = private int

any different from a bounded abstraction

type t <: int

With OCaml's explicit subtyping coercions, the former seems to be subsumed

by the latter.

- Andreas