Zen Practice on ATS

218 views
Skip to first unread message

Kiwamu Okabe

unread,
Dec 2, 2014, 11:28:34 PM12/2/14
to ats-lang-users, ats-lang-users
Hi all,

I think ATS beginner is difficult to understand Applied Type System.
Then, I write Zen Practice on ATS.

https://github.com/githwxi/ATS-Postiats/wiki/Zen-Practice-on-ATS

Is it correct?

Thank's,
--
Kiwamu Okabe at METASEPI DESIGN

gmhwxi

unread,
Dec 3, 2014, 1:04:47 AM12/3/14
to ats-lan...@googlegroups.com, ats-lan...@lists.sourceforge.net, kiw...@debian.or.jp
I assume that this kind of "philosophical" stuff is always very difficult
to translate.

ATS is a layered system: dynamics: statics: sorts. The statics is meant
to reason about dynamics. Say that you have an array of 0's and 1's. A type
for this array may state that there are equal number of 0's and 1's in
this array at a particular point (though how 0's and 1's are distributed
is unspecified). Your waterpipe analogy is also a good one: shape stays
but content can change.

Say you write a program in Python or Ruby. Once your program is constructed,
what should you do? Most likely, you want to run it so that you may find some bugs.
Then you try to fix the bugs and repeat the process. It is not surprising that such
languages put emphasis on unit testing.

In ATS, you can do the same but there is another option. You can try to refine the
types in your program so that you can flush out some (potential) bugs. What is so
attractive about this option is that you do not have to wait until your program is ready to
run. You can (and probably should) do it while you are developing your program.

So while types themselves are static, types assigned to values can be refined.
Compared to OCaml and Haskell, I think it is probably fair to say that ATS supports
type refinement to a much greater extent.

john skaller

unread,
Dec 3, 2014, 8:40:50 AM12/3/14
to gmhwxi, ats-lan...@googlegroups.com, ats-lan...@lists.sourceforge.net

On 03/12/2014, at 5:04 PM, gmhwxi wrote:

>
> So while types themselves are static, types assigned to values can be refined.
> Compared to OCaml and Haskell, I think it is probably fair to say that ATS supports
> type refinement to a much greater extent.

Although no expert and unsure how one would measure such extent,
I would point out Ocaml polymorphic variants, which I suspect are considerably
higher level than ATS typing.

However, there are problems. Here's a rough description.

Originally, polymorphic variants were flat. A polymorphic variant is just
an arbitrary set of variants (injections, type constructors ...).
Unlike ordinary variants where a union type had fixed variants.

So consider processing some cases A so that after we have
an overlapping set of cases B, with ordinary variants you have
to translate to corresponding but distinct variants, with polymorphic
variants you do not. This provides both run time performance advantage
as well as reducing coding load.

I was one of the first users, trying to use them in my compiler.
Compilers translate terms in many passes and using a fat set
of all possibilities as inputs and output for every pass is common
practice but it is easy to break invariants.

But flat polymorphic variants are of no use here because the terms a compiler
deals with are almost universally recursive.

Enter covariant polymorphic variants.

Consider a type

type t1 = [ `A of t1 | `B | `C]

and suppose you want to get rid of the C. Then you have a type

type t2 = [`A of t1 | `B ]

by throwing out the `C term, which is no good at all because the argument of
the `A constructor is still the old t1. We need to reduce that to t2.

The method for doing this is hard, we have to use open recursion:

type 'a t2' = [`A of 'a | `B]
type 'a t1' = ['a t2' | `C]
type t2 = 'a t2' as 'a
type t1 = 'a t1' as 'a

[I think that's right .. er .. hmm ]

Anyhow the key point is on line 2: we built t1' on top of t2' by replacing
the recursion with a type variable. This makes the terms "flat" but parametric.

Then we close the terms with recursion. so now t1 is a genuine subtype of t2.
We have to ensure covariance here because the processing function that removes
the `C term is recursive.

This is a simple example, which shows it isn't too hard. Very useful indeed
if you have a type with 20 or so variants (which I do in my compiler).

But there's a problem, and it's a killer. There isn't just one type involved,
there are 6 or more mutually recursive types.

So now, you get a combinatorial explosion, you need 6 or more parameters
in the open types and closing them precisely is well nigh impossible.

In the end I gave up and went back to "assert false" on branches that
shouldn't occur. Static checking would be much better but it would
make the code very difficult to get right, and also very fragile.

Furthermore, the usage in explosive/propagating. Lets suppose a term

Symbol of int * descr | Literal of float

is used somewhere and you refine type descr as you process these terms ..
then you have to refine the above type as well. Refinements propagate to all
parents (including self if the type is recursive).

ATS typing is quite different of course and refinement isn't done the same way,
but the issue above with refinement would seem to be a universal issue.

Actually there's another example, well known: take pointer types in C and
throw in "const". That broke almost every program in existence including
pretty much the whole C Standard library. C++ was designed with const
already there but it introduced a lot of complexity. and the
concept really broken down with templates.

Another interesting example is Posix. No one can be happy with the fact
everything is an int: file descriptors, sockets, error codes, thread ids ..
even Posix isn't happy about this and uses typedefs to try to distinguish.
In the Felix bindings for Posix, I can and do use stronger typing,
but always it creates a mess because each function has strange special
cases and overlaps that don't fit any sane typing scheme.

I'm really interested to see how ATS handles these issues. Refinements
have to propagate or they're useless, but not too far or they render the
code too difficult to write and modify. They need to be contained somehow,
for example by hiding them behind an interface (abstraction).


--
john skaller
ska...@users.sourceforge.net
http://felix-lang.org



gmhwxi

unread,
Dec 3, 2014, 2:44:56 PM12/3/14
to ats-lan...@googlegroups.com, gmh...@gmail.com, ats-lan...@lists.sourceforge.net

Hi John,

I really admire your ability to compose well-written long messages *rapidly*.
It takes me forever to write such messages :(

When I wrote that ATS can support type-refinement to a much greater extent
when compared to OCaml and Haskell, I had the following thought:

The basis of the type systems of OCaml and Haskell is Hindley-Milner.
Hindley-Milner type system supports the notion of principal types. That is,
every expression can be assigned a type that is the most general for that
expression. While this may sound great (especially for the purpose of type
interference), it clearly implies that Hindley-Milner types cannot be very
"expressive". There is really no type-refinement here because you can already,
for good or bad, get the so-called most general types for expressions.

I am not aware that either OCaml or Haskell preaches a programming style
based on type-refinement. I will talk about the problem you raised in your
message later.

Say I have function foo in ATS of the following type:

fun foo: int -> bool

A refinement in this case is to use a more accurate dependent type:

fun foo: {n:nat} int(n) -> bool // foo should only be applied to natural numbers.

Maybe the 'int' stands for an opened file descriptor:

fun foo: {n:nat} (!file_descriptor_v(n) | int(n)) -> bool

These are typical type-refinements you can do in ATS.

john skaller

unread,
Dec 3, 2014, 5:39:38 PM12/3/14
to gmhwxi, ats-lan...@googlegroups.com, ats-lan...@lists.sourceforge.net

On 04/12/2014, at 6:44 AM, gmhwxi wrote:

>
> Hi John,
>
> I really admire your ability to compose well-written long messages *rapidly*.
> It takes me forever to write such messages :(

If only I could write *programs*that quickly :-)

>
> When I wrote that ATS can support type-refinement to a much greater extent
> when compared to OCaml and Haskell, I had the following thought:
>
> The basis of the type systems of OCaml and Haskell is Hindley-Milner.
> Hindley-Milner type system supports the notion of principal types. That is,
> every expression can be assigned a type that is the most general for that
> expression. While this may sound great (especially for the purpose of type
> interference), it clearly implies that Hindley-Milner types cannot be very
> "expressive". There is really no type-refinement here because you can already,
> for good or bad, get the so-called most general types for expressions.
>
> I am not aware that either OCaml or Haskell preaches a programming style
> based on type-refinement. I will talk about the problem you raised in your
> message later.

I don't know about "preaching" but a lot of the more recent and advanced
extensions to Ocaml require type specifications: inference doesn't work.
For polymorphic variants it works sometimes and not others.

If you're using them, you're better off naming the type anyhow because if you get
an error .. well the error messages are unreadable, and you can't even be
sure it's an error, it could just be a failure of inference.

Ocaml also supports second order polymorphism (i.e. passing genuinely
polymorphic functions as values) and of course HM inference doesn't
allow that so again, you have to use explicit typing.

You could always do that with record types but I think now you can write

let eval ('a . f : 'a -> int) x y = f x, f y in
f 1 "hello"

Of course here the type of f is MORE general than HM inference allows :)

>
> Say I have function foo in ATS of the following type:
>
> fun foo: int -> bool
>
> A refinement in this case is to use a more accurate dependent type:
>
> fun foo: {n:nat} int(n) -> bool // foo should only be applied to natural numbers.
>
> Maybe the 'int' stands for an opened file descriptor:
>
> fun foo: {n:nat} (!file_descriptor_v(n) | int(n)) -> bool
>
> These are typical type-refinements you can do in ATS.

These are primarily subsets of integers.

A similar case which might be supported: subsets of string type
specified by regular expressions? There is a calculus of regular
sets which might be utilised.

However these are "flat" refinements. How about recursive ones
as with the Ocaml polymorphic variants? How would you use refinement
to manage the type of terms produced and consumed by passes of
a compiler?

gmhwxi

unread,
Dec 3, 2014, 7:56:34 PM12/3/14
to ats-lan...@googlegroups.com, ats-lan...@lists.sourceforge.net, kiw...@debian.or.jp
Ocaml also supports second order polymorphism (i.e. passing genuinely
polymorphic functions as values) and of course HM inference doesn't
allow that so again, you have to use explicit typing.

You could always do that with record types but I think now you can write

        let eval  ('a . f : 'a -> int) x y = f x, f y in
        f 1 "hello"
Of course here the type of f is MORE general than HM inference allows :)

Generally speaking, polymorphism tries to make more programs typable
while type-refinement aims at assigning more accurate types to programs that
are already typable. I often see a highly perverted style of programming in OCaml
and/or Haskell that tries to address in higher-order logic what should really be
done in the first-order logic. Of course, many people see this as a "cool" way of
doing things. However, when something seems so "cool", it is usually a telling sign of
underlying unnaturalness.


On Tuesday, December 2, 2014 11:28:34 PM UTC-5, Kiwamu Okabe wrote:

Barry Schwartz

unread,
Dec 3, 2014, 8:21:14 PM12/3/14
to ats-lan...@googlegroups.com
gmhwxi <gmh...@gmail.com> skribis:
> However, when something seems so "cool", it is usually a
> telling sign of underlying unnaturalness.

That seems very much how I have come to look at programming issues in
recent years.

My ‘favorite’ (because immediately apparent) examples are some of the
technically ‘cool’ but utterly unnatural ways Python deals with
character strings, such as making a particular formatting algorithm an
innate behavior of strings, or treating joining a list of strings as
an innate behavior of, say, the null string. Instead I ask _why_ are
we doing it that way.

Once upon a time I’d have thought ‘That’s cool!’ instead of ‘That’s
horrible!’ like I do now. :)

gmhwxi

unread,
Dec 3, 2014, 8:53:23 PM12/3/14
to ats-lan...@googlegroups.com, gmh...@gmail.com, ats-lan...@lists.sourceforge.net
>>However these are "flat" refinements. How about recursive ones
as with the Ocaml polymorphic variants? How would you use refinement
to manage the type of terms produced and consumed by passes of
a compiler?

Say we have:

datatype T1 = A1 of () | B1 of t1
datatype T2 = A2 of () | B2 of t2 | C2 of (t2, t2)

There is an interesting relation between T1 and T2, that is,
a value of the type T1 can be *naturally* treated as a value of T2:

fun from_T1_to_T2(x: T1): T2 =
  case x of
  | A1 () => A2 ()
  | B1 (x1) => B2 (from_t1_to_t2(x1))

If tag(A1)=tag(A2) and tag(B1)=tag(B2), then from_T1_to_T2 can be
replaced with the identity function.

For now, I would just introduce a cast function to go from T1 to T2. I treat
this issue as a form of syntactic convenience (instead of something of semantic
depth).

gmhwxi

unread,
Dec 3, 2014, 9:09:46 PM12/3/14
to ats-lan...@googlegroups.com, gmh...@gmail.com, ats-lan...@lists.sourceforge.net
Say

datatype T3 = A3 of () | C3 of (T2, T3) // yes, T2 and T3
Then a value of T3 can also be naturally treated as a value of T2.
However, the ATS compiler uses different tags for C2 and C3.

Two solutions:

1) Force the compiler uses the same tag for C3
2) Find a way around this:

abstype bottom // for no values
datatype T3 = A3 of () | B3 of bottom | C3 of (T2, T3)
castfn from_T3_to_T2: T3 -> T2

I have never tried the first solution. I use the second one in my code.

gmhwxi

unread,
Dec 3, 2014, 9:40:25 PM12/3/14
to ats-lan...@googlegroups.com, gmh...@gmail.com, ats-lan...@lists.sourceforge.net

Furthermore, the usage in explosive/propagating. Lets suppose a term

        Symbol of int * descr | Literal of float

is used somewhere and you refine type descr as you process these terms ..
then you have to refine the above type as well. Refinements propagate to all
parents (including self if the type is recursive).

You write too much compiler code :)

When you do type-refinement, you may need to introduce some abstract types
and cast functions to stop unwanted propagation. However, I would say that propagation
is often what you need in order to flush out potential bugs. Here is a typical scenario:

You use the un-indexed type Array(T) and run-time checks in your code. Later, you
replace Array(T) with array(T, n); this change is likely to generates many type-errors;
fixing these type-errors can lead the elimination of many run-time array-bounds checks.

Doing type-refinement in ATS means that you often need to use unsafe features in order
to be practical. But this is far better, in my opinion, than claiming well-typedness in a setting
where types are so inaccurate.

On Wednesday, December 3, 2014 8:40:50 AM UTC-5, John Skaller wrote:

john skaller

unread,
Dec 3, 2014, 10:15:22 PM12/3/14
to gmhwxi, ats-lan...@googlegroups.com, ats-lan...@lists.sourceforge.net

On 04/12/2014, at 12:53 PM, gmhwxi wrote:

> For now, I would just introduce a cast function to go from T1 to T2. I treat
> this issue as a form of syntactic convenience (instead of something of semantic
> depth).

It's sometimes not clear, to me at least, the difference between "syntactic
convenience" and "semantics".

One could argue category theory is largely a huge rats nest of syntactic
manipulation of no semantic value.

On the other hand, as a language designer I find it quite challenging to
match up basic operations with good syntax: ultimately the bulk of work
in any programming task is syntactic drudgery and reducing the housekeeping
and allowing the clean expression of ideas is quite important.

For example overloading (in Felix anyhow) is just a syntactic convenience
as it is in (template free C++). Yet it is a vital convenience to write

a + b

for addition when you have 26 distinct types of integer. Ocaml has no
overloading and is close to useless in many contexts for that reason.

ATS, unusually, has distinctly named functions which are separately
overloaded which is really cool.

john skaller

unread,
Dec 3, 2014, 10:31:36 PM12/3/14
to gmhwxi, ats-lan...@googlegroups.com, ats-lan...@lists.sourceforge.net

On 04/12/2014, at 1:09 PM, gmhwxi wrote:

> castfn from_T3_to_T2: T3 -> T2
>
> I have never tried the first solution. I use the second one in my code.

However, just the type conversion isn't enough. You also need a
"print" routine at least for any terms for diagnostics.

For 2 or 3 variants this isn't an issue but consider this
which is the actual unbound AST term for a type in Felix (Ocaml notation):

and typecode_t =
| TYP_label
| TYP_void of Flx_srcref.t (** void type *)
| TYP_name of Flx_srcref.t * Flx_id.t * typecode_t list
| TYP_case_tag of Flx_srcref.t * int
| TYP_typed_case of Flx_srcref.t * int * typecode_t
| TYP_lookup of Flx_srcref.t * (expr_t * Flx_id.t * typecode_t list)
| TYP_index of Flx_srcref.t * string * index_t
| TYP_callback of Flx_srcref.t * qualified_name_t
| TYP_suffix of Flx_srcref.t * (qualified_name_t * typecode_t)
| TYP_patvar of Flx_srcref.t * Flx_id.t
| TYP_patany of Flx_srcref.t
| TYP_tuple of typecode_t list (** product type *)
| TYP_unitsum of int (** sum of units *)
| TYP_sum of typecode_t list (** numbered sum type *)
| TYP_intersect of typecode_t list (** intersection type *)
| TYP_record of (Flx_id.t * typecode_t) list (** anon product *)
| TYP_variant of (Flx_id.t * typecode_t) list (** anon sum *)
| TYP_function of typecode_t * typecode_t (** function type *)
| TYP_cfunction of typecode_t * typecode_t (** C function type *)
| TYP_pointer of typecode_t (** pointer type *)
| TYP_array of typecode_t * typecode_t (** array type base ^ index *)
| TYP_as of typecode_t * Flx_id.t (** fixpoint *)
| TYP_typeof of expr_t (** typeof *)
| TYP_var of index_t (** unknown type *)
| TYP_none (** unspecified *)
| TYP_ellipsis (** ... for varargs *)
(* | TYP_lvalue of typecode_t *) (** ... lvalue annotation *)
| TYP_isin of typecode_t * typecode_t (** typeset membership *)

(* sets of types *)
| TYP_typeset of typecode_t list (** discrete set of types *)
| TYP_setunion of typecode_t list (** union of typesets *)
| TYP_setintersection of typecode_t list (** intersection of typesets *)

(* dualizer *)
| TYP_dual of typecode_t (** dual *)

| TYP_apply of typecode_t * typecode_t (** type function application *)
| TYP_typefun of simple_parameter_t list * typecode_t * typecode_t
(** type lambda *)
| TYP_type (** meta type of a type *)
| TYP_type_tuple of typecode_t list (** meta type product *)

| TYP_type_match of typecode_t * (typecode_t * typecode_t) list
| TYP_type_extension of Flx_srcref.t * typecode_t list * typecode_t
| TYP_tuple_cons of Flx_srcref.t * typecode_t * typecode_t

and now here are all the unbound kinds of expressions:

and expr_t =
| EXPR_label of Flx_srcref.t * string
| EXPR_vsprintf of Flx_srcref.t * string
| EXPR_interpolate of Flx_srcref.t * string
| EXPR_map of Flx_srcref.t * expr_t * expr_t
| EXPR_noexpand of Flx_srcref.t * expr_t
| EXPR_name of Flx_srcref.t * Flx_id.t * typecode_t list
| EXPR_index of Flx_srcref.t * string * index_t
| EXPR_case_tag of Flx_srcref.t * int
| EXPR_typed_case of Flx_srcref.t * int * typecode_t
| EXPR_projection of Flx_srcref.t * int * typecode_t
| EXPR_lookup of Flx_srcref.t * (expr_t * Flx_id.t * typecode_t list)
| EXPR_apply of Flx_srcref.t * (expr_t * expr_t)
| EXPR_tuple of Flx_srcref.t * expr_t list
| EXPR_tuple_cons of Flx_srcref.t * expr_t * expr_t
| EXPR_record of Flx_srcref.t * (Flx_id.t * expr_t) list
| EXPR_record_type of Flx_srcref.t * (Flx_id.t * typecode_t) list
| EXPR_variant of Flx_srcref.t * (Flx_id.t * expr_t)
| EXPR_variant_type of Flx_srcref.t * (Flx_id.t * typecode_t) list
| EXPR_arrayof of Flx_srcref.t * expr_t list
| EXPR_coercion of Flx_srcref.t * (expr_t * typecode_t)
| EXPR_suffix of Flx_srcref.t * (qualified_name_t * typecode_t)
| EXPR_patvar of Flx_srcref.t * Flx_id.t
| EXPR_patany of Flx_srcref.t
| EXPR_void of Flx_srcref.t
| EXPR_ellipsis of Flx_srcref.t
| EXPR_product of Flx_srcref.t * expr_t list
| EXPR_sum of Flx_srcref.t * expr_t list
| EXPR_intersect of Flx_srcref.t * expr_t list
| EXPR_isin of Flx_srcref.t * (expr_t * expr_t)
| EXPR_orlist of Flx_srcref.t * expr_t list
| EXPR_andlist of Flx_srcref.t * expr_t list
| EXPR_arrow of Flx_srcref.t * (expr_t * expr_t)
| EXPR_longarrow of Flx_srcref.t * (expr_t * expr_t)
| EXPR_superscript of Flx_srcref.t * (expr_t * expr_t)
| EXPR_literal of Flx_srcref.t * Flx_literal.literal_t
| EXPR_deref of Flx_srcref.t * expr_t
| EXPR_ref of Flx_srcref.t * expr_t
| EXPR_likely of Flx_srcref.t * expr_t
| EXPR_unlikely of Flx_srcref.t * expr_t
| EXPR_new of Flx_srcref.t * expr_t
| EXPR_callback of Flx_srcref.t * qualified_name_t
| EXPR_lambda of Flx_srcref.t * (funkind_t * vs_list_t * params_t list * typecode_t * statement_t list)
| EXPR_range_check of Flx_srcref.t * expr_t * expr_t * expr_t
| EXPR_not of Flx_srcref.t * expr_t

(* this boolean expression checks its argument is
the nominated union variant .. not a very good name for it
*)
| EXPR_match_ctor of Flx_srcref.t * (qualified_name_t * expr_t)

(* this boolean expression checks its argument is the nominate
sum variant
*)
| EXPR_match_case of Flx_srcref.t * (int * expr_t)

(* this extracts the argument of a named union variant -- unsafe *)
| EXPR_ctor_arg of Flx_srcref.t * (qualified_name_t * expr_t)

(* this extracts the argument of a number sum variant -- unsafe *)
| EXPR_case_arg of Flx_srcref.t * (int * expr_t)

(* this just returns an integer equal to union or sum index *)
| EXPR_case_index of Flx_srcref.t * expr_t (* the zero origin variant index *)

| EXPR_letin of Flx_srcref.t * (pattern_t * expr_t * expr_t)

| EXPR_get_n of Flx_srcref.t * (int * expr_t)
| EXPR_get_named_variable of Flx_srcref.t * (Flx_id.t * expr_t)
| EXPR_as of Flx_srcref.t * (expr_t * Flx_id.t)
| EXPR_as_var of Flx_srcref.t * (expr_t * Flx_id.t)
| EXPR_match of Flx_srcref.t * (expr_t * (pattern_t * expr_t) list)

(* this extracts the tail of a tuple *)
| EXPR_get_tuple_tail of Flx_srcref.t * expr_t
| EXPR_get_tuple_head of Flx_srcref.t * expr_t

| EXPR_typeof of Flx_srcref.t * expr_t
| EXPR_cond of Flx_srcref.t * (expr_t * expr_t * expr_t)

| EXPR_expr of Flx_srcref.t * string * typecode_t

| EXPR_type_match of Flx_srcref.t * (typecode_t * (typecode_t * typecode_t) list)

| EXPR_extension of Flx_srcref.t * expr_t list * expr_t

we also have patterns, statements, a subset of the statement which
are executable instructions ...

Apart from printing all these terms, there are also functions to find
the original source location (in the users file), iterators, and maps.

And this is just the raw input AST, which then goes through several
transformations until we end up with bound terms (ones where
string names are replaced by referenced to the symbol to which they
refer, i.e. with lookup done). Then the bound terms go through several
optimisation, transformation and reduction processes. Because my compiler
translates to C++ things one would normally erase, such as source locations
and type information, have to be retained all the way through.

So some way to have a term such as those above with constraints that
can change as we go through some process .. without having to use a new
type for it, and thereby not only make a slightly modified copy of the term, but
also a new printing function, a new iterator, a new mapper ... would seem
ideal (for this particular task).

As it is I often write checking functions for the constraints, but this sucks
compared to static typing.

john skaller

unread,
Dec 3, 2014, 10:44:05 PM12/3/14
to gmhwxi, ats-lan...@googlegroups.com, ats-lan...@lists.sourceforge.net

On 04/12/2014, at 1:40 PM, gmhwxi wrote:

>
> You write too much compiler code :)

It's more fun that SQL ...

>
> When you do type-refinement, you may need to introduce some abstract types
> and cast functions to stop unwanted propagation. However, I would say that propagation
> is often what you need in order to flush out potential bugs. Here is a typical scenario:
>
> You use the un-indexed type Array(T) and run-time checks in your code. Later, you
> replace Array(T) with array(T, n); this change is likely to generates many type-errors;
> fixing these type-errors can lead the elimination of many run-time array-bounds checks.
>
> Doing type-refinement in ATS means that you often need to use unsafe features in order
> to be practical. But this is far better, in my opinion, than claiming well-typedness in a setting
> where types are so inaccurate.

Yes, I agree with this. And I see how, given we refine a type which then spews
a lot of type errors, we can systematically go through them and do one of three things:

(a) fix the type by proof the additional constraint is met
(b) prove the constraint is met by a run time check
(c) cast away the constraint

Hopefully the second two "patches" can be revisited later,
and a static proof used to replace the run time check
or the assertion.

So a core part of this development paradigm would be having
a systematic way of identifying places where errors have been
quelled by (b) or (c) so we can improve confidence in correctness
and/or performance.

This would allow "piecemeal" improvement of the code.

john skaller

unread,
Dec 3, 2014, 11:46:12 PM12/3/14
to gmhwxi, ats-lan...@googlegroups.com, ats-lan...@lists.sourceforge.net

On 04/12/2014, at 1:09 PM, gmhwxi wrote:

> Say
>
> datatype T3 = A3 of () | C3 of (T2, T3) // yes, T2 and T3

What exactly does T2, T3 mean?

If that's an intersection, why not use \cap instead of , for the operator?

William ML Leslie

unread,
Dec 3, 2014, 11:59:23 PM12/3/14
to ats-lan...@lists.sourceforge.net, ats-lan...@googlegroups.com
On 4 December 2014 at 15:45, john skaller <ska...@users.sourceforge.net> wrote:
>
> On 04/12/2014, at 1:09 PM, gmhwxi wrote:
>
>> Say
>>
>> datatype T3 = A3 of () | C3 of (T2, T3) // yes, T2 and T3
>
> What exactly does T2, T3 mean?
>
> If that's an intersection, why not use \cap instead of , for the operator?

It's a product. Parameters to the constructor C3 have types T2 and T3.

--
William Leslie

Notice:
Likely much of this email is, by the nature of copyright, covered
under copyright law. You absolutely MAY reproduce any part of it in
accordance with the copyright law of the nation you are reading this
in. Any attempt to DENY YOU THOSE RIGHTS would be illegal without
prior contractual agreement.

gmhwxi

unread,
Dec 4, 2014, 12:00:10 AM12/4/14
to ats-lan...@googlegroups.com, gmh...@gmail.com, ats-lan...@lists.sourceforge.net

datatype T2 = A2 of () | B2 of T2 | C2 of (T2, T2)

You can have

datatype T3 = A3 of () | C3 of (T3, T3)

or you can have

datatype T3_1 = A3 of () | C3 of (T2, T3_1)

or you can have

datatype T3_2 = A3 of () | C3 of (T3_2, T2)

T3, T3_1, and T3_2 can all be *naturally* embedded into T2
(in terms of data representation). That is why I said that I saw this
as a form of syntactic convenience (as I did not (and do not) see how
T3, T3_1 or T3_2 is semantically related to T2).

gmhwxi

unread,
Dec 4, 2014, 2:25:11 AM12/4/14
to ats-lan...@googlegroups.com, gmh...@gmail.com, ats-lan...@lists.sourceforge.net

>>For 2 or 3 variants this isn't an issue but consider this
which is the actual unbound AST term for a type in Felix (Ocaml notation):

Say you have datatypes T1 and T2; each of them contains 100 constructors.
Due to the similarity between T1 and T2, you can merge T1 and T2 into T12,
which, say, contains 120 constructors. At this point we have two cast functions:

castfn T1_to_T12: T1 -> T12
castfn T2_to_T12: T2 -> T12

Say you implement:

fun print_T12: T12 -> void

Then you can also implement:

fun print_T1 (x: T1): void = print (T1_to_T12(x))
fun print_T2 (x: T2): void = print (T2_to_T12(x))

Things, however, can get trickier. Say you have a
constructor Cons, and you want print_T1 and print_T2
to handle Cons in different ways. I think I will be able
to deal with this issue by making use of templates.

I will try to code up an example and post it on-line.

gmhwxi

unread,
Dec 4, 2014, 2:42:39 AM12/4/14
to ats-lan...@googlegroups.com, gmh...@gmail.com, ats-lan...@lists.sourceforge.net
Here is a working example:

https://github.com/githwxi/ATS-Postiats/blob/master/doc/EXAMPLE/ATS-QA-LIST/qa-list-2014-12-04.dats

I will do the template version at another time.

gmhwxi

unread,
Dec 5, 2014, 2:20:46 AM12/5/14
to ats-lan...@googlegroups.com, gmh...@gmail.com, ats-lan...@lists.sourceforge.net

I managed to work out another example involving sharing of datatype constructors:

https://github.com/githwxi/ATS-Postiats/blob/master/doc/EXAMPLE/ATS-QA-LIST/qa-list-2014-12-04-2.dats

This one is template-based. It is a lot trickier, showing it is not only possible to share but also possible to differ
(when needed). I am not really sure if it is worth the effort in this kind of situation. Anyway, I do like this example
quite a bit.

john skaller

unread,
Dec 12, 2014, 9:16:39 PM12/12/14
to gmhwxi, ats-lang-users
I would like to learn to use ATS and also get some help using it
for some things.

1. Can ATS be used to produce a stand alone C library?

This means: I will specify the functions which must be provided.
They will have extern "C" linkage.

I am thinking of Judy. Judy provides a cache tuned digital trie.
It is the fasted mutable associative store in existence which
provides both random and sequential access. There are 3 primary
data types: word -> bit, word -> word, and NTBS -> word.
[NTBS: null terminated string]

It also scales to terrabyte store without any issues.
All core operations on Judy are O(1) (O(N) for NTBS where
N is the string length).

Judy is slower at some sizes for random access compared
to hash tables (but hash tables can't do sequential access).
OTOH Judy only handles word keys (hash is more general,
it handles anything with a comparison and hash function).

However the code is horrible, full of macros, and a lot of work would be
required to rewrite it in ATS. We'd use templates to replace the macros.
And we'd want to use the type system to prove correctness.

This would be an interesting job to do because Judy is the best.
And it isn't used much for some reason (perhaps because the code
is unmaintainable :)

Judy does have one downside in a modern context: it is not compatible
with a plugin garbage collector because it is a digital trie (keys are
split into non-contiguous byte streams which means some pointers
to objects will be invisible to a collector not trained in Judy).
My own GC knows about Judy though :)

2. I have another library which I would think is suitable for
conversion to ATS: a precise garbage collector. Which uses
Judy mentioned above. However it is currently a C++ library.
The virtual dispatch is actually used as follows: the allocator
used is abstract. The GC itself is split into an abstraction and
implementation, the core implementation is not thread safe.
A wrapper implementation is provides as well which is thread safe.

To convert to C is possible, just using a record of function pointers.

The GC is not entirely trivial. It's a precise naive mark/sweep collector
driven by compiler generated tables. The main sweep mechanism
uses an array of offsets to find pointers. Judy arrays provide repetition
counts to allow variable length arrays to be handled.

When sweeping, the recursive descent *posts* pointers into a buffer
for later analysis until a fixed recursion depth is reached. Then the
scan falls out and the analysis routine runs through the buffer.
This mechanism is complex, a proof of correctness would be really
valuable.

This GC could also be used with ATS is the ATS compiler could be taught
how to generate the RTTI tables required to run the GC.

Unlike Judy, which it relies upon, the collector is not a very large
piece of code. The longest function is about one page.

3. I have a number of other libraries which would be better written
in ATS than C++ (and also some I do NOT have which I would like
to write).

4. More difficult: my compiler (Felix) generates C++ from a high level
language. It is basically a souped up type safe macro processor
which uses libraries to define Felix types and functions in terms of C++ types
and functions. Then it churns away and emits C++.

It would be good to emit ATS as well. The ATS code would then be translated
by the ATS compiler into C which would then be compiled as C++ into a library
which is linked to the rest of the code. It's *essential* that ATS compiler emits
C code which is also compliant C++ code (the primary requirement is to note
that C++ REQUIRES an explicit cast converting from a void * to another pointer
type, C doesn't: there are some other issues: C++ has strongly typed enums
for example, C doesn't).

This is a particularly hard project because it would be useless unless
Felix can be trained to emit suitable types and proof values.
[The Felix compiler doesn't have to understand them, but it does have
to be able to sequence and compose them]

All these points above require one thing: ATS has to be able to generate

(a) libraries (not programs)
(b) functions with specified names and interfaces
(c) extern "C" wrappers
(d) C++ compilant C code

I believe ATS can do (a) and (b). (c) is a trivial mod.
(d) would require rewriting the compiler as required to ensure
casts are used where required, and enums are used in a suitable
way (enums in C++ are NOT integers).

Brandon Barker

unread,
Dec 12, 2014, 10:01:50 PM12/12/14
to ats-lang-users
On Fri, Dec 12, 2014 at 9:15 PM, john skaller <ska...@users.sourceforge.net> wrote:
I would like to learn to use ATS and also get some help using it
for some things.

1. Can ATS be used to produce a stand alone C library?

This means: I will specify the functions which must be provided.
They will have extern "C" linkage.

Yes. Here is an older post about it:

A lot of little projects with 'MYPORTDIR' subdirectories support this style.

For the C interface, see here:


I just checked here and it looks like this could stand for some updating:

 



--
You received this message because you are subscribed to the Google Groups "ats-lang-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ats-lang-user...@googlegroups.com.
To post to this group, send email to ats-lan...@googlegroups.com.
Visit this group at http://groups.google.com/group/ats-lang-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/ats-lang-users/825FAAD9-F0E2-4519-993C-B9D160EDA7CF%40users.sourceforge.net.


--
Brandon Barker
brandon...@gmail.com

Brandon Barker

unread,
Dec 12, 2014, 10:14:04 PM12/12/14
to ats-lang-users

john skaller

unread,
Dec 12, 2014, 11:41:18 PM12/12/14
to Brandon Barker, ats-lang-users

On 13/12/2014, at 2:59 PM, Brandon Barker wrote:

> In the wiki article there is this, which I think is what you are referring to:
>
> Make an ATS function available in C (can be implemented in ATS or C):
>
> extern
> fun myfun (n: int, res: int): int = "ext#myfun_in_c" (* call as myfun_in_c(int, int) in C *)
>
> (* Have the C function name be the same as the ATS function name: *)
> extern
> fun myfun (n: int, res: int): int = "ext#" (* call as myfun(int, int) in C *)

That's not enough in C++. You have to say

// in header file:
extern "C" void name (int);

// in body code:
void name ( int x) { ... }

But its WORSE. In both C and C++ you ALSO have to say something like:


#if !defined(FLX_STATIC_LINK) && FLX_WIN32
#define FLX_EXPORT __declspec(dllexport)
#define FLX_IMPORT __declspec(dllimport)
#else
#define FLX_EXPORT
#define FLX_IMPORT
#endif


#ifdef BUILD_RTL
#define RTL_EXTERN FLX_EXPORT
#else
#define RTL_EXTERN FLX_IMPORT
#endif

Here "BUILD_RTL" has to be supplied as a C/C++ compiler switch when building
the DLL and not otherwise. FLX_STATIC_LINK has to be supplied if compiling
for static linkage (and not otherwise).

Now the functions have to be declared like

RTL_EXTERN void flx_trace(flx_trace_t* tr,flx_range_srcref_t sr, char const *file, int line, char const *msg);

in C++ for a C++ function and

extern "C"

as well if you want to use C liker names. This is usually done like:


#ifdef __cplusplus /* support use by C++ code */
extern "C" {
#endif

RTL_EXTERN void flx_trace(flx_trace_t* tr,flx_range_srcref_t sr, char const *file, int line, char const *msg);

#ifdef __cplusplus
}
#endif

It used to be only Windows required dllimport/dllexport but this is NOT so any more.
Now GNU linkers (finally) have visibility control, you have to do it with all modern gcc
systems (i.e. all Unix, OSX and Linux) platforms too (unless you want to export everything!)

So clearly this is all non-trivial and it HAS to be supported by ATS compiler.
I'm pretty sure ATS compiler cannot do the DLL import/export at present.
In particular

(a) the ATS to C compiler has to generate the appropriate code AND

(b) the C/C++ compiler/linker toolchain invoked by ATS driver
program has to provide the required STATIC_LINK and BUILD_LIB switches
as well.

I don't care about (b), I can organise C/C++ compile/link myself.

All this is in addition to ensuring the C code works under C++ compiler as well.
(no implicit casts from void * allowed, no assuming enums are ints, no funny business
with bool either .. )

Even if you compile the library with C, the header files MUST work in C++
as well as C.

getting all this right is not hard but the method is NOT obvious: more than half the
open source projects in existence get it wrong.

The macro method for building DLLs above is the ONLY correct way to do it.

In turn this means when you're translating a library from ATS to C you MUST
supply the build macro name to the ATS translator. Which I don't think ATS
supports at present either.

So .. I may be wrong .. but I do not think ATS can generate libraries yet.

Just as an aside, although messy this is the correct protocol:

export sym; // in provider
import sym; // in consumer

C, as usual, gets everything wrong and tries to just use

extern sym;

for both purposes and that simply does not work. [In particular
the statement

extern int x;

has unspecified meaning: you can't tell if it's a declaration or definition,
so C's hackery doesn't even work in C]

Barry Schwartz

unread,
Dec 13, 2014, 12:06:44 AM12/13/14
to ats-lan...@googlegroups.com
john skaller <ska...@users.sourceforge.net> skribis:
> But its WORSE. In both C and C++ you ALSO have to say something like:
>
>
> #if !defined(FLX_STATIC_LINK) && FLX_WIN32
> #define FLX_EXPORT __declspec(dllexport)
> #define FLX_IMPORT __declspec(dllimport)
> #else
> #define FLX_EXPORT
> #define FLX_IMPORT
> #endif
>
>
> #ifdef BUILD_RTL
> #define RTL_EXTERN FLX_EXPORT
> #else
> #define RTL_EXTERN FLX_IMPORT
> #endif

etc.

I asked about this a few weeks ago. Yes, it can be done. It is not
done by using ="ext:" but by a more convoluted but otherwise better
method; I am not the one to discuss it, however.

Brandon Barker

unread,
Dec 13, 2014, 9:13:34 AM12/13/14
to ats-lan...@googlegroups.com
Sorry I misunderstood the issue, I have not done much C++ (though I would be relatively happy to do). Barry, I also seem to have missed or not remembered your post on the issue =/

gmhwxi

unread,
Dec 13, 2014, 9:37:06 AM12/13/14
to ats-lan...@googlegroups.com
If you search for 'convoluted', you will find the post.
Reply all
Reply to author
Forward
0 new messages