I’ve been putting this one off too long.
Felix has two boolean types: bool and cbool. cbool is just C++ bool, which is
typically one byte. If you want to store a boolean value as a component
of a cstruct, you have to use cbool, because bool is 64 bits. bool and cbool are
inter-convertible with explicit constructors.
At present bool is the native boolean, meaning comparisons all generate bool, not cbool.
The definitions are:
typedef bool = 2;
type cbool = "bool" requires index TYPE_cbool;
It’s worth noting that
//$ Short cut and via closure
noinline fun andthen (x: bool, y:1->bool) : bool =>
if x then #y else false
;
//$ Short cut or via closure
noinline fun orelse (x: bool, y:1->bool) : bool =>
if x then true else #y
;
These are expensive! Although
fun land: bool * bool -> bool = "$1&&$2"; // x and y
devolves to the C++ shortcut operator, Felix can and does unravel function
arguments, which can defeat the lazy evaluation.
Now you may ask, and this is the point of this article, why 64 bits?
The answer is that bool = 2 is just one of a family of special types called
compact linear types. Products of compact linear types are compacted
into a single 64 bit unsigned integer. For example
2 * 3
occupies only the values 0..5. In particular it uses the standard representation
for variable radix natural numbers so that
true, false, true
is actually using just 3 bits. So an array of 64 bools uses a single 64 bit
machine word.
Surprisingly at first, the components of a compact linear product are
addressable. For example
var x = true, false, true;
var px = &x.1;
assert *px == false;
px <- true;
assert x == true, true, true;
Compact linear pointers use three machine words: the base pointer of the
64 bit word, an integer to divide by, and an integer to find the remainder
with respect to. For powers of 2, these are just right shift and mask
operations, respectively.
Now, the main reason for compact linear types is to provide polyadic arrays.
If you have an array:
var x : double ^ (2 * 3);
then you can index it with a value like (1,2), so it’s a matrix, but you can
also coerce it to the type
double ^ 2 ^ 3
and index it like x . 2 . 1 (note the reversed order!), OR you can coerce it to
an integer index
double ^ 6
and put it in a SINGLE loop to scan all elements. ALL array indices in Felix
are compact linear and so ALL arrays, irrespective of index type, can be
coerced to a linear array, allowing all the elements to be scanned
in a single loop. The reason is that arrays on a 64 bit machine simply
cannot be larger than 2^64 elements, not because your computer doesn’t
have enough memory .. but because it doesn’t have enough address space
to do anything larger. So the restriction to 64 bits is not a restriction in practice
if you’re doing arrays with compact linear index types.
Of course if the compact linear type is ITSELF an array, then even the
smallest value type, a bool, limits you to only 64 elements,
and fewer if the component types are bigger.
Given a CONCRETE type, that is, a monomorphic type, Felix figures out
when it can use compact linear types automatically. This can lead
to the surprise that the type 2^128 doesn’t work! You have to use cbool ^ 128.
The restriction could be removed, but it’s tricky for say base 3.
However the REAL problems start with polymorphism:
2 * T
You don’t know whether this is compact linear until T is replaced:
if T = int, it’s not compact linear; if T = bool, it is.
Now 2 * T is a tuple type but it is NOT an array, even if T is eventually
replaced by type 2. This ain’t C++ crap: Felix respects parametric polymorphism.
What this means is you’re only allowed to do things to polymorphic types
that work for ALL possible replacements of T, and parametricity says,
that can be ANY type at all.
Now the magic says: well, we don’t care, we can still do pointer
operations on any product, since compact linear types have pointers.
The problem is the pointers are different types! Here’s the crux of the problem:
For ordinary types, pointer projections are associative, which means
a pointer to a type T nested in a more complex type is just a pointer to T.
If T is a product, we can point at one of its components and that
pointer stands alone: we don’t care how we got there.
This is NOT true for compact linear pointers, because they have to point
first to a 64 bit machine word, and then we use math operations to extract
or set component values. All these operations compose just fine for
monomorphic types because we know when we hit the word boundary
to switch pointer models.
But we do NOT know with polymorphic types.
The problem is that ordinary products
A * B
and compact linear products use the same notation. The interpretation
for ordinary products is “Addressable byte sequence with padding for
alignment” whereas for compact linear products it is all math operations
on a fixed size integer. So when you have a deeply nested product
with a type variable deep:
A * (B * ( C * (D * T)))
you really don’t know how to address ANY of the components until after
T is replaced. If A,B,C,D are compact linear, the addressing operations
depends on whether T is or not. But polymorphism says that isn’t allowed.
What it comes down to is we don’t know if the * is an ordinary product
or a compact linear product.
Now in theory, the solution is just laziness. The problem is a polymorphic
pointer now needs to keep the CONTEXT of the addressable component
around until we know where the compact linear type boundary is.
We will know after monomorphisation, but the problem is that polymorphically
we have to keep a whole list representing a projection chain. In other
words, a pointer to the T in the above type is not just &T, but actually
&(A * …) . 1 . 1 . 1
We can’t just write &T because if the type is compact linear, we have lost
the location of the 64 bit machine word.
—
John Skaller
ska...@internode.on.net