19 views

Skip to first unread message

Dec 4, 2010, 2:25:09 PM12/4/10

to caml-list caml-list

Dear caml-list,

I would like to start a constructive discussion on the syntax of GADT

constructors of the ocaml gadt branch, which can be found at:

I would like to start a constructive discussion on the syntax of GADT

constructors of the ocaml gadt branch, which can be found at:

https://sites.google.com/site/ocamlgadt/

There are two separate issues:

1) general constructor form

option a)

type _ t =

TrueLit : bool t

| IntLit of int : int lit

option b)

type _ t =

TrueLit : bool t

| IntLit : int -> int lit

I'm open to other options. The branch has used option b) from the

start, but I've just switched to option a) to see what it's like

Personal opinion:

I slightly prefer option b), because it makes it clear that it's a

gadt constructor right from the start. This is useful because the type

variables in gadt constructors are independent of the type parameters

of the type, consider:

type 'a t = Foo of 'a : 'b t

this, counter intuitively, creates a constructor Foo of type forall 'd

'e. 'd t -> 'e t.

2) explicit quantification of existential variables

option a)

leave existential variables implicitly quantified. For example:

type _ u = Bar of 'a t : u

or

type _ u = Bar : 'a t -> u

option b)

specifically quantify existential variables. For example:

type _ u = Bar of 'a. 'a t : u

or

type _ u = Bar : 'a. 'a t -> u

Currently, the branch uses option a).

Personal opinion: I prefer option b). This is for four reasons:

I) the scope of the explicitly quantified variable is not clear. For

example, how do you interpret:

type _ u = Bar of 'a. 'a : 'a t

or

type _ u = Bar : 'a. 'a -> 'a t

In one interpretation bar has type forall 'a 'b. 'a -> 'b t and in

another interpretation it has type forall 'a. 'a -> 'a t. My

inclination would be to flag it as an error.

II)

In the example of option b), the 'a variable is quantified as a

universal variable but, in patterns, it is used as an existential

variable. This is something I found very confusing in Haskell where

they actually use the 'forall' keyword.

III) option a) is the current Haskell GADT syntax and I've never heard

anyone complain about it

IIII) I don't see how option b) improves either readability or bug prevention

I look forward to hearing your opinions.

--Jacques Le Normand

_______________________________________________

Caml-list mailing list. Subscription management:

http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list

Archives: http://caml.inria.fr

Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

Bug reports: http://caml.inria.fr/bin/caml-bugs

Dec 4, 2010, 2:37:07 PM12/4/10

to Jacques Le Normand, caml-list caml-list

I prefer option a) for the constructor form, and b) for the quantification

syntax. I'm strongly sure that the explicit quantification is a much better

choice, while I think the first choice (constructor form) is more of a

matter of taste.

syntax. I'm strongly sure that the explicit quantification is a much better

choice, while I think the first choice (constructor form) is more of a

matter of taste.

I think the constructor form using "of" is better as the general function

syntax suggests currying is possible (| Foo : a -> b -> c), while it is not

possible for constructors (or you maybe it is, and then should support it).

"of foo * bar * baz" is not perfect (I prefer the revised syntax "of foo and

bar and baz"), but at least it's a consistent defect we're used to.

Dec 4, 2010, 2:39:53 PM12/4/10

to Jacques Le Normand, caml-list caml-list

On Sat, 4 Dec 2010 14:25:01 -0500

Jacques Le Normand <rathe...@gmail.com> wrote:

Jacques Le Normand <rathe...@gmail.com> wrote:

> Dear caml-list,

> I would like to start a constructive discussion on the syntax of GADT

> constructors of the ocaml gadt branch, which can be found at:

>

> https://sites.google.com/site/ocamlgadt/

>

> There are two separate issues:

>

> 1) general constructor form

>

> option a)

>

> type _ t =

> TrueLit : bool t

> | IntLit of int : int lit

>

> option b)

>

> type _ t =

> TrueLit : bool t

> | IntLit : int -> int lit

>

I would believe that we need more examples to choose the most readable

or natural syntax. Because both options have their valid point.

Maybe you might

1. provide a more convincing example, e.g. a 20 lines tutorial example,

and give it in both syntaxes. This let people choose really...

2. choose one syntax for the ocamlyacc (or menhir) based parser, and

provide an Ocamlp4 (or is it Ocamlp5) extension for the other syntax.

3. support both syntaxes at once, and have the user choose what fits him best.

Cheers.

--

Basile STARYNKEVITCH http://starynkevitch.net/Basile/

email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359

8, rue de la Faiencerie, 92340 Bourg La Reine, France

*** opinions {are only mine, sont seulement les miennes} ***

Dec 4, 2010, 2:42:07 PM12/4/10

to Jacques Le Normand, caml-list caml-list

Hi! Just a quick answer for now.

On Sat, Dec 4, 2010 at 8:25 PM, Jacques Le Normand <rathe...@gmail.com> wrote:

> Dear caml-list,

> I would like to start a constructive discussion on the syntax of GADT

> constructors of the ocaml gadt branch, which can be found at:

>

> https://sites.google.com/site/ocamlgadt/

>

> There are two separate issues:

>

> 1) general constructor form

>

> option a)

>

> type _ t =

> TrueLit : bool t

> | IntLit of int : int lit

>

> option b)

>

> type _ t =

> TrueLit : bool t

> | IntLit : int -> int lit

>

> I'm open to other options. The branch has used option b) from the

> start, but I've just switched to option a) to see what it's like

I like option (a) for consistency with the existing OCaml syntax, and

while I like option (b) for its conformance to standard notation, I

don't like your reasons for liking (b) ;-)

> I slightly prefer option b), because it makes it clear that it's a

> gadt constructor right from the start. This is useful because the type

> variables in gadt constructors are independent of the type parameters

> of the type, consider:

>

> type 'a t = Foo of 'a : 'b t

>

> this, counter intuitively, creates a constructor Foo of type forall 'd

> 'e. 'd t -> 'e t.

I think that the scope should propagate, i.e. that somehow the 'a

should really be bound, giving

Foo : forall 'a . 'a t -> 'a t

> 2) explicit quantification of existential variables

I don't even like the problem formulation. I think that existential

variables should not be differentiated from universal variables. (So I

think I like what you don't like about the Haskell solution).

>

> option a)

>

> leave existential variables implicitly quantified. For example:

>

> type _ u = Bar of 'a t : u

> or

> type _ u = Bar : 'a t -> u

>

> option b)

>

> specifically quantify existential variables. For example:

>

> type _ u = Bar of 'a. 'a t : u

> or

> type _ u = Bar : 'a. 'a t -> u

>

> Currently, the branch uses option a).

For me, it is a question for _all_ variables whether be implicitly or

explicitly quantified...

> I) the scope of the explicitly quantified variable is not clear. For

> example, how do you interpret:

>

> type _ u = Bar of 'a. 'a : 'a t

> or

> type _ u = Bar : 'a. 'a -> 'a t

>

> In one interpretation bar has type forall 'a 'b. 'a -> 'b t and in

> another interpretation it has type forall 'a. 'a -> 'a t.

Of course the "forall 'a. 'a -> 'a t" as far as I'm concerned!

> II)

>

> In the example of option b), the 'a variable is quantified as a

> universal variable but, in patterns, it is used as an existential

> variable. This is something I found very confusing in Haskell where

> they actually use the 'forall' keyword.

It often happens in logic! You have two sides of the turnstyle...

I'm sorry if I sounded harsh, not my intention!

Best Regards.

Dec 4, 2010, 3:14:57 PM12/4/10

to Lukasz Stafiniak, caml-list caml-list

see below

having

type 'a t = Foo of 'a : 'b t

creating a constructor of type forall 'a. 'a -> 'a t is really

confusing since the user explicitly gave the return type of the

constructor as 'b t.

Dec 4, 2010, 3:54:26 PM12/4/10

to Lukasz Stafiniak, caml-list caml-list

you may want to name your type parameters because you can mix GADT

constructors with normal constructors. consider:

constructors with normal constructors. consider:

type 'a term =

| Ignore of 'a term : int term

| Embed of 'a

I don't think propagating type parameters is a good idea. The current

rule is: ignore type parameters in GADT constructors. This is simpler

than unifying the type parameters with the arguments of the return

type of the constructor.

On Sat, Dec 4, 2010 at 3:22 PM, Lukasz Stafiniak <luks...@gmail.com> wrote:

> On Sat, Dec 4, 2010 at 9:14 PM, Jacques Le Normand <rathe...@gmail.com> wrote:

>>

>> having

>>

>> type 'a t = Foo of 'a : 'b t

>>

>> creating a constructor of type forall 'a. 'a -> 'a t is really

>> confusing since the user explicitly gave the return type of the

>> constructor as 'b t.

>

> But why would the user have named 'a in the first place then. In the

> context of Foo, it should be assumed that

>

> 'a t = 'b t

Dec 4, 2010, 4:00:53 PM12/4/10

to Jacques Le Normand, caml-list caml-list

On Sat, Dec 4, 2010 at 9:54 PM, Jacques Le Normand <rathe...@gmail.com> wrote:

> you may want to name your type parameters because you can mix GADT

> constructors with normal constructors. consider:

>

> type 'a term =

> | Ignore of 'a term : int term

> | Embed of 'a

>

> I don't think propagating type parameters is a good idea. The current

> rule is: ignore type parameters in GADT constructors. This is simpler

> than unifying the type parameters with the arguments of the return

> type of the constructor.

> you may want to name your type parameters because you can mix GADT

> constructors with normal constructors. consider:

>

> type 'a term =

> | Ignore of 'a term : int term

> | Embed of 'a

>

> I don't think propagating type parameters is a good idea. The current

> rule is: ignore type parameters in GADT constructors. This is simpler

> than unifying the type parameters with the arguments of the return

> type of the constructor.

But it expands to:

> type 'a term =

> | Ignore of 'a term : int term

> | Embed of 'a : 'a term

Dec 4, 2010, 4:07:02 PM12/4/10

to Lukasz Stafiniak, caml-list caml-list

in this case, yes, but if you have constraints then it is different. consider:

type 'a term =

| Ignore of 'a : int term

| Embed of 'a

constraint 'a = int

this is different from:

type 'a term =

| Ignore of 'a : int term

| Embed of 'a : 'a term

constraint 'a = int

(in fact, an error is flagged on the second one)

Also, it can save the user some typing.

Dec 5, 2010, 3:10:31 AM12/5/10

to Jacques Le Normand, caml-list caml-list

On Sat, Dec 4, 2010 at 10:06 PM, Jacques Le Normand

<rathe...@gmail.com> wrote:

> in this case, yes, but if you have constraints then it is different. consider:

>

> type 'a term =

> | Ignore of 'a : int term

> | Embed of 'a

> constraint 'a = int

>

> this is different from:

>

> type 'a term =

> | Ignore of 'a : int term

> | Embed of 'a : 'a term

> constraint 'a = int

>

> (in fact, an error is flagged on the second one)

>

> Also, it can save the user some typing.

<rathe...@gmail.com> wrote:

> in this case, yes, but if you have constraints then it is different. consider:

>

> type 'a term =

> | Ignore of 'a : int term

> | Embed of 'a

> constraint 'a = int

>

> this is different from:

>

> type 'a term =

> | Ignore of 'a : int term

> | Embed of 'a : 'a term

> constraint 'a = int

>

> (in fact, an error is flagged on the second one)

>

> Also, it can save the user some typing.

I don't think that it is very different. Standard-language constraints

are outside of implications, so they apply to each branch. It already

behaves this way in the standard language, basically restricting the

type family to "int term". So I don't see why

> | Embed of 'a : 'a term

> constraint 'a = int

shouldn't mean that Embed is "basically" Embed of int : int term. My

position is that if a type variable should be treated as local to a

branch, it should be explicitly quantified (using the dot notation,

for example "Ignore of 'a. 'a : int term", without any exceptions to

"existential" variables). And that the patterns

type X =

| Branch of Y

and

type X =

| Branch of Y : X

should be equivalent (where X could for example be 'a term).

I would accept concessions to my general outlook on the grounds of

being conservative over standard OCaml programmer intuitions and

conciseness of code... when they apply...

Dec 5, 2010, 3:17:20 AM12/5/10

to Jacques Le Normand, caml-list caml-list

On Sun, Dec 5, 2010 at 9:10 AM, Lukasz Stafiniak <luks...@gmail.com> wrote:

>

> I don't think that it is very different. Standard-language constraints

> are outside of implications, so they apply to each branch. It already

> behaves this way in the standard language, basically restricting the

> type family to "int term". So I don't see why

>

>> | Embed of 'a : 'a term

>> constraint 'a = int

>

> shouldn't mean that Embed is "basically" Embed of int : int term.

>

> I don't think that it is very different. Standard-language constraints

> are outside of implications, so they apply to each branch. It already

> behaves this way in the standard language, basically restricting the

> type family to "int term". So I don't see why

>

>> | Embed of 'a : 'a term

>> constraint 'a = int

>

> shouldn't mean that Embed is "basically" Embed of int : int term.

[To clarify: and the whole family restricted to "int term".]

Dec 5, 2010, 3:26:01 AM12/5/10

to Jacques Le Normand, caml-list caml-list

> shouldn't mean that Embed is "basically" Embed of int : int term. My

> position is that if a type variable should be treated as local to a

> branch, it should be explicitly quantified (using the dot notation,

> for example "Ignore of 'a. 'a : int term", without any exceptions to

> "existential" variables). And that the patterns

>

> type X =

> | Branch of Y

>

> and

>

> type X =

> | Branch of Y : X

>

> should be equivalent (where X could for example be 'a term).

> position is that if a type variable should be treated as local to a

> branch, it should be explicitly quantified (using the dot notation,

> for example "Ignore of 'a. 'a : int term", without any exceptions to

> "existential" variables). And that the patterns

>

> type X =

> | Branch of Y

>

> and

>

> type X =

> | Branch of Y : X

>

> should be equivalent (where X could for example be 'a term).

To clarify again: modulo variable capture! That is, the pattern

> type X =

> | Branch of 'a. Y : X

will be different (unless 'a is not used in Y nor X).

Dec 5, 2010, 3:35:42 AM12/5/10

to Lukasz Stafiniak, caml-list caml-list

I'm not sure I see the point of this long discussion between Lukasz and

Jacques.

Jacques.

It seems everybody agree that it is a good idea to explicitely quantify the

constructor-specific type variables.

The question Jacques raised is wether, when we write (| Foo of 'a . S : T)

or (| Foo : 'a . S -> T), the 'a is quantified on S only, or on both S and

T.

It think we all agree that, for semantical reasons, quantification should be

scoped over both S and T. However, the (of S : T) syntax does not make it

very obvious, and this should be rightfully considered as a defect of this

syntax.

I'm under the impression that your intense debate is about the edge case

where :

1. we don't use explicit quantification of constructor-specific variables

2. we reuse the type parameter variables in the type of a GADT constructor

(so they're implicitly shadowed by the implicit quantifications, or maybe

not)

If we reject possibly 1. (and ask for explicit quantification), all is well.

If you want to consider option 1., then I think the edge case 2. is evil and

shoud result in a warning.

Dec 5, 2010, 7:22:09 PM12/5/10

to bluestorm, caml...@yquem.inria.fr

On 2010/12/05, at 17:35, bluestorm wrote:

> I'm not sure I see the point of this long discussion between Lukasz and Jacques.

>

> It seems everybody agree that it is a good idea to explicitely quantify the constructor-specific type variables.

>

> The question Jacques raised is wether, when we write (| Foo of 'a . S : T) or (| Foo : 'a . S -> T), the 'a is quantified on S only, or on both S and T.

>

> It think we all agree that, for semantical reasons, quantification should be scoped over both S and T. However, the (of S : T) syntax does not make it very obvious, and this should be rightfully considered as a defect of this syntax.

>

> I'm under the impression that your intense debate is about the edge case where :

> 1. we don't use explicit quantification of constructor-specific variables

> 2. we reuse the type parameter variables in the type of a GADT constructor (so they're implicitly shadowed by the implicit quantifications, or maybe not)

>

> If we reject possibly 1. (and ask for explicit quantification), all is well. If you want to consider option 1., then I think the edge case 2. is evil and shoud result in a warning.

Actually I'm not sure that fully explicit quantification is necessary, or even desirable.

The reason is that the gadt extension introduces actually two kinds of case-specific

type variables: universals and existentials.

Namely, in the type

type _ term =

Int : int -> int term

| Bool : bool -> bool term

| Lam : ('a -> 'b) -> ('a -> 'b) term

| App : ('a -> 'b) * 'a -> 'b term

the type variables 'a and 'b in Lam are universals, but

in App only 'b is universal, while 'a is existential.

Personally, I would prefer this to be written:

type _ term =

Int of int : int term

| Bool of bool : bool term

| Lam of ('a -> 'b) : ('a -> 'b) term

| App of 'a. ('a -> 'b) * 'a : 'b term

That is, use a syntax close to the current ocaml one, which makes

easy to quantify clearly existential variables, so that they cannot

be confused with universal ones. If we use the arrow syntax,

the natural scope for quantification includes the return type, which

is incorrect for existential types. And using the "of" syntax, it is hard

to quantify type variables appearing in the return type, so I think

this is better to leave the universals implicit.

Considering the definition

type 'a t = Foo of 'a : 'b t

I think that is should either unify 'a and 'b, being in practice

equivalent to "type 'a t = Foo of 'a", or flag an error because 'a is not

available in this branch. I might actually prefer the unification approach,

because this conforms to the intuition that the scope of 'a is the whole

type definition, but maybe people can come to understand that an

explicit return type overrides this.

Jacques Garrigue

Dec 6, 2010, 4:11:06 AM12/6/10

to caml...@yquem.inria.fr

Hi,

On Sat, Dec 04, 2010 at 08:25:07PM +0100, Jacques Le Normand wrote:

> option a)

> type _ t =

> TrueLit : bool t

> | IntLit of int : int lit

> option b)

> type _ t =

> TrueLit : bool t

> | IntLit : int -> int lit

For a:

A little bit easier to parse (as a Camlp5 designer), which is just

a parse of the 'of' part followed (or not) with the ':' and another

type. In the option b), in the "revised" syntax version, I decided

to also separate the constructor parameters with arrows, which forced

me to add a function separating the ending type, the parsing being

therefore more complicated.

For b:

1/ The difference between normal constructors and constructors with GADT

is very visible. All examples given here are often with definitions

which are relatively short, but I tried an example with constructors

with several long types and I like the idea of seing immediately

that they are GADTs, rather at the end of the definition line (or

lines).

2/ I like the idea of defininig them with a syntax like the one of

functions definitions, like 'val'. In the "revised" syntax version,

where the constructors parameters are in "curried" form, this is

even more readable (even if partially applied parameters are

not accepted in the OCaml compiler).

PS The latest version of Camlp5 works with Jacques' version. You can

download it at: http://pauillac.inria.fr/~ddr/camlp5/

PPS The version with GADT is very interesting to parse parsing rules. I

is what I was searching for many years. In Camlp[45], the EXTEND statement

generates calls to Obj.magic and types constraints to "extend" the

OCaml type system. (This cannot be simulated by first class modules.)

--

Daniel de Rauglaudre

http://pauillac.inria.fr/~ddr/

Reply all

Reply to author

Forward

0 new messages

Search

Clear search

Close search

Google apps

Main menu