kinding and stuff


John Skaller2

Oct 23, 2018, 11:08:50 PM
to felix google
So I have just upgraded the compiler, making it more robust and simpler,
and enforcing the kinding system more rigorously.

Some issues arise.

OVERLOADING
=============

One is that when you define a type function:

typedef f = fun (x:TYPE):TYPE => ….


the symbol f is a NonFunction symbol. This means there can only be one of them
in any scope. So you can define

typedef fun <= (T:TYPE):TYPE= ….

which defines <= but it doesn’t overload with <= defined for expressions. That’s
because “typedef fun f” means the same as “typedef f = fun …”, i.e. it works similarly
to

var f = fun (x:int) => …

No overloading. For fixed symbols like “<“, “<=“ etc we can fix this by mapping
them in the type grammar. So you define

typedef type_le = …

and the parser maps

T1 <= T2

to

type_le (T1, T2)

But that’s not really very satisfactory.

GENERIC TYPE MATCHES
======================

These are implemented. If you write

typematch x with … => TRUE | … => FALSE endmatch

for example, the compiler knows the match is kind

TYPE -> BOOL

Empty matches aren’t allowed. The compiler uses unification to find
the most specific kind of which the kind of each branch is a subkind.
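
A minimal OCaml sketch of such a join, using the subkind facts stated in a later post
(UNITSUM below COMPACTLINEAR below TYPE); placing BOOL directly below TYPE is my assumption:

type kind = KIND_unitsum | KIND_compactlinear | KIND_bool | KIND_type

let subkind a b =
  a = b ||
  match a, b with
  | KIND_unitsum, (KIND_compactlinear | KIND_type) -> true
  | KIND_compactlinear, KIND_type -> true
  | KIND_bool, KIND_type -> true
  | _ -> false

(* join two kinds: the most specific common superkind *)
let join a b =
  if subkind a b then b
  else if subkind b a then a
  else KIND_type

(* the kind of a whole match: fold the join over the branch kinds *)
let kind_of_match = function
  | [] -> invalid_arg "empty matches aren't allowed"
  | k :: ks -> List.fold_left join k ks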

UNIT
====

Felix can kind type tuples. For example

int, long

has kind

TYPE * TYPE

and of course

int : TYPE

But what about

()

This is an empty type tuple. It’s very important. There are no array kinds yet, but

TYPE ^ 0 = UNIT
TYPE ^ 1 = TYPE
TYPE ^ 2 = TYPE * TYPE

etc would make sense. Recall TYPE is the category of types and functions.
The UNIT category is extremely important because without it, there’s no
way to give a kind to a constant type function:

typedef boool = fun () => bool;

This should have kind

UNIT -> TYPE

but we cannot say that at the moment. There’s an issue here:

_typeop (“_staticbool_and”, T, BOOL)

because this function is a chain function: it “and”s together any number of expressions,
including none! The argument T has to be a type tuple of types of kind BOOL.
Of course if T = () we want the function to return TRUE.

In this case, staticbool’s and operator knows an empty type tuple must be
interpreted as

BOOL ^ 0

so the question arises how to handle UNIT: should there be one UNIT, or
should BOOL ^ 0 and TYPE ^ 0 be distinct?

It’s the same kind of question you’d ask if a type match were empty (no branches).
In that case it would be

TYPE -> VOID

Two nasty things happen here: I need to get the algebra right, and, to type the
result properly … WE NEED SORTS ON TOP OF THE KINDS.

To do generic kinds, we have to have KIND VARIABLES.

There is much confusion here. I am focussed more on BOOL and using
it for constraints, and on UNITSUM for polyadic linear arrays. Even the array
stuff is problematic because it ONLY works for linear arrays.

One has to remember that

((1,2,3),(4,5)) : int ^ 3 * int ^ 2

can be linearised with a type coercion to type int ^ 5. We note the following fact:
the type can also be written:

int ^ (3 + 2)

This is hard to grok, but it means that to index the array, you first pick which subarray
to access, the first or second one, then give the index within that one.
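
A minimal OCaml sketch of that two-step indexing, with hypothetical names: an index
of type 3 + 2 is a sum value, and linearising it just offsets the second case past
the first subarray.

type idx = First3 of int | Second2 of int   (* an index of type 3 + 2 *)

let linearise = function
  | First3 i -> i        (* i in 0..2: inside the first subarray *)
  | Second2 i -> 3 + i   (* i in 0..1: offset past the first subarray *)

let () =
  (* element (Second2 1) of the flattened int ^ 5 sits at position 4 *)
  assert (linearise (Second2 1) = 4)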

So what’s the problem?????

HERE:

Static assert failed
In /Users/skaller/felix/x.flx: line 10, cols 1 to 50
9:
10: static-assert type_eq (int^3 * int^2, int^(3+2));
**************************************************
11:

The problem is clear: the compiler doesn’t know how to use all the index laws.
This fails too:

static-assert type_eq (3 *+ int, int + int + int);

But this works:

static-assert type_eq (int^3, int * int * int);

The reason is CANONICAL FORMS. When the compiler sees the RHS above it
is translated to the LHS, so they test equal because they’re identical type terms.

But this isn’t done for everything.
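
A toy OCaml sketch of that canonicalisation, with a hypothetical type AST: a tuple
whose components are all the same type is rewritten to an array term, so
int * int * int and int ^ 3 become identical terms.

type ty = Int | Tuple of ty list | Array of ty * int

let rec canon t =
  match t with
  | Tuple ts ->
    let ts = List.map canon ts in
    (match ts with
     | hd :: tl when List.for_all (( = ) hd) tl -> Array (hd, List.length ts)
     | _ -> Tuple ts)
  | t -> t

let () = assert (canon (Tuple [Int; Int; Int]) = Array (Int, 3))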

Now here is the KEY THING!!!!

Programming languages are CHARACTERISED by the choice of which
isomorphisms are represented as identities.

A trivial example: in Felix int ^ 1 = int. There are no tuples of one element.
In other languages, like Python, there are tuples of one element:

(42,)

is how you write a tuple of one element in Python. This has advantages and
disadvantages. For example suppose we had a kind TUPLE, then

int * long : TUPLE
1 : TUPLE
int : ??

You see here, int isn’t a tuple, even though int ^ 1 is equal to int.
And here’s the hassle:

( (1,2), 3) : (int * int) * int

this is a tuple of two values, the first is a tuple also, the second isn’t.
So what about:

T * U

Now we don’t know if T is a TUPLE or not!

If we had tuples of one element the type system would be a lot more uniform.
The PROBLEM with that is that you then need two coercions:

untuple (1,) -> 1
entuple 1 -> (1,)

Entuple works on any value, but untuple requires a tuple of length 1.
What’s wrong with that?

There’s nothing *wrong* with it, but now these functions are distinct:

fun f[T] (x:T) => …
fun f[T] (x: T ^ 1) => …

It’s worse: if you make arrays distinct from tuples .. then you need to convert

int * int * int <—> int ^ 3

These types are isomorphic but not identical if you do this. At present in Felix,
a tuple of elements of the same type is an array. There are no arrays of length 1.
An array of length 0 is identical to (). In particular, the element type of the array
is lost.

On the other hand, this isomorphism:

2 * int === int + int

is NOT provided, instead we have to use:

2 *+ int

the “repeated sum”. The reason is that the “standard” representation of sums
is indeed

struct _uctor_ { uint64_t tag; void *data; };

and I should have used:

template<class T> … T data …

and then the layout would be the same and I could make the types identical.
But I can’t. And sum types have other efficient representation for example

variant opt[T] = None | Some of T;

is literally a NULL pointer for the first case and a pointer to T for the second one.
Other variants have other compact representations.

In fact the core one is OF COURSE compact linear types!

Because of these choices, the isomorphisms cannot be made into
identities. Of course

A * B === B * A

but that isomorphism is not one we want to be an identity!



John Skaller
ska...@internode.on.net





John Skaller2

Oct 23, 2018, 11:36:25 PM
to felix google
>
> The problem is clear: the compiler doesn’t know how to use all the index laws.
> This fails too:
>
> static-assert type_eq (3 *+ int, int + int + int);

I have no idea why because, checking, the canonical form IS enforced!
Debugs show the types are indeed identical.

The check above is done by type matches in the library, not using
type ops.



John Skaller
ska...@internode.on.net





John Skaller2

Oct 24, 2018, 12:01:14 AM
to felix google


> On 24 Oct 2018, at 14:36, John Skaller2 <ska...@internode.on.net> wrote:
>
>>
>> The problem is clear: the compiler doesn’t know how to use all the index laws.
>> This fails too:
>>
>> static-assert type_eq (3 *+ int, int + int + int);
>
> I have no idea why because, checking, the canonical form IS enforced!
> Debugs show the types are indeed identical.

Ah, no they don’t… trivial stupidity: forgot to put *+ operator in the
type grammar. Not sure why it parsed!


John Skaller
ska...@internode.on.net





John Skaller2

Oct 25, 2018, 10:25:58 PM
to felix google
RATIONALE
==========

A reduction is a rule like the one here:

class FloatAddgrp[t] {
  inherit Eq[t];
  virtual fun neg : t -> t;

  reduce inv(x:t): - (-x) => x;
  axiom sym (x:t,y:t): x+y == y+x;
}

It’s a special kind of axiom which is not checked; rather, it tells the compiler
to replace one expression by another. In this case it says

if you see an expression doubly negated, remove both negations

I removed processing of reductions some time ago but now I want to put them back.
Vital optimisations can be expressed with reductions which are hard to build into
the compiler with generality, or, indeed, at all!

A trivial analogue of the above:

rev (rev x) => x

reversing a list twice has no effect, so don’t! There are some VERY POWERFUL
optimisations which are less trivial. Here’s a FUSION reduction:

map f (map g x) ==> map (f \circ g) x

For example, instead of mapping g over a list, then mapping f
over the result, map the composite function f \circ g over the list.
Saving the construction and rescanning and deallocation of a temporary list.

What is even more interesting about this one is that

THE RULE APPLIES TO ALL FUNCTORS

Not just lists, trees, arrays, anything at all, as long as it is
a functor. In fact

THIS IS THE DEFINITION OF A FUNCTOR

The point is, map preserves the structure of the original data type, changing
only the values of the elements. And I might add this is one of the core
optimisations in Haskell. But it is not the only fusion optimisation.
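
The law is easy to check concretely; here it is for OCaml’s List.map (not Felix,
just an illustration of the rule):

let () =
  let compose f g x = f (g x) in
  let xs = [1; 2; 3] in
  (* two passes over the list versus one pass with the composite *)
  assert (List.map string_of_int (List.map succ xs)
          = List.map (compose string_of_int succ) xs)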

At present, Felix has no way to say: if this is a functor, maps over it can
be fused: but we CAN say for a particular functor like list to do the fusion,
and then again for array.

PROBLEMS
=========

Applying reduction is EXTREMELY EXPENSIVE. What Felix does is take
some code and match the LHS of EVERY reduction rule to EVERY expression,
and that includes EVERY SUBEXPRESSION OF EVERY EXPRESSION.
If a reduction is performed, a previous failure could now succeed so we might
do the matching again and keep going until there are no more reductions.
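
A toy OCaml fixpoint rewriter for the rev (rev x) => x rule shows why a successful
rewrite forces rescanning (the term type is hypothetical):

type term = Rev of term | Lit of int list

let rec reduce t =
  match t with
  | Rev (Rev x) -> reduce x   (* the reduction rule fires *)
  | Rev x ->
    (match reduce x with
     | Rev y -> y             (* reducing the child exposed a NEW redex *)
     | y -> Rev y)
  | Lit _ -> t

let () = assert (reduce (Rev (Rev (Rev (Lit [1; 2])))) = Rev (Lit [1; 2]))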

So it’s hellishly expensive. But that’s the least of the problems!
At present I’m working on a reduction pass that finds reductions
in your code AS YOU WROTE IT.

It’s very unlikely it will find any. Who is going to write

rev (rev mylist)

??? No one. Where we profit from reductions is applying them
AFTER INLINING. For example it’s common, when processing lists, to take
a list, do some processing on it, resulting in a backwards list, so
our function reverses it. Another function does the same, and another.
Then you chain them together. However a lot of these algorithms
work fine on reversed lists too, for example map. In fact rev_map
is faster than map, because map is literally defined as rev_map,
which is tail recursive, followed by rev.

So why does the programmer keep reversing the list?
The answer is: because it’s too dang hard to keep track of the order
of the list otherwise. You have to throw in a “rev” manually after
counting the number of operations if the number is odd.

The optimisation above, applied recursively AFTER inlining,
might do that for you. Although we need an axiom

rev (f x) = f (rev x)

to allow it.
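
In OCaml the same trick shows up with List.rev_map: map is rev_map followed by rev,
so chaining two rev_maps makes the two hidden reversals cancel, which is exactly the
rev (rev x) => x reduction applied after inlining:

let () =
  let xs = [1; 2; 3] in
  (* no reversal is ever performed on the left hand side *)
  assert (List.rev_map succ (List.rev_map succ xs)
          = List.map succ (List.map succ xs))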

But here’s the problem: inlining is now done AFTER monomorphisation,
and monomorphisation maps every instance of a polymorphic function
or type that is actually used to a brand new function or type.

And the reduction pattern matching won’t recognise these patterns.
[Unless (a) they were already monomorphic or (b) they’re C bindings
which are not monomorphised]

It isn’t possible to instantiate reductions in classes because we don’t
know what instances are needed. Also there is a real problem in that
if we do somehow find a match after monomorphisation, and replace
some expression with another, where do the symbols in the new expression
come from??? We didn’t know that instance of a symbol was going to be used
when we monomorphised.

Felix USED to do inlining on polymorphic functions, before monomorphisation,
in fact Felix never did monomorphisation! It just tracked required instances
and the monomorphisation was done “on the fly” in the code generator.

But that meant inlining had to not only replace parameters with arguments
it had to specialise types as well and it got really hard to understand the code
at all. So I decided to monomorphise first, THEN inline, so there were no type
variables or specialisations to worry about.

The good news is monomorphic reductions survive, as do polymorphic
ones applied to C bindings. So I can eliminate the others post-monomorphisation
provided I make sure the RHS symbols are retained by the symbol garbage collector.
After a pass, wipe all the reductions so the GC can clean out unused symbols.

I have no idea if all this will be effective though, remembering the cost of
checking if reduction rules actually apply.



John Skaller
ska...@internode.on.net





John Skaller2

Oct 26, 2018, 11:26:20 AM
to felix google
So .. monomorphic reductions are now first preserved, then discarded.
The check is that the symbols are all available in the monomorphisation table.
I think I can propagate them through to inlining by adding the reduction garbage
collection routine in the right place. Some monomorphic reductions will be lost
if their RHS contains unused symbols, even if applying the reduction would lead
to the symbol being used. In theory this situation can be improved, but even as it stands it may be useful.

And now I have a cool idea, though I don’t know for sure it will work.

Let’s add a new nominal data type rope, which contains a list of strings.
It stands for the concatenation of the strings. Adding ropes, or prepending or
appending a string to a rope is fast compared to the equivalent operations
on strings.

But, other operations won’t work, so we have to convert ropes back to strings
to do them.
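
A minimal OCaml sketch of the idea (the real Felix implementation appears a couple
of posts below): a rope is just the list of its pieces, kept in reverse order so
appending is O(1), and rendering joins the pieces once.

type rope = string list   (* pieces, in reverse order *)

let of_string s : rope = [s]
let append (r : rope) s : rope = s :: r           (* O(1) *)
let concat (a : rope) (b : rope) : rope = b @ a   (* a followed by b *)
let render (r : rope) = String.concat "" (List.rev r)

let () =
  let r = concat (of_string "Hello ") (of_string "world") in
  assert (render r = "Hello world")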

Now here is the FUN!!

Consider

reduce ropes (x:string,y:string): x + y => mkrope (x,y);
reduce ropes (x:rope, y:string): x + y => mkrope (x,y);
reduce ropes (x:string, y:rope): x + y => mkrope (x,y);

and three matching mkrope functions, one for each reduction:

fun mkrope(x:string,y:string) => …


Now, all string concatenation makes ropes and ropes add to strings.

So how do we get a string when we need one, for example:

search (s1, s2)

searches in s1 for s2. We don’t want to overload search so s1 and s2 can
be a string or a rope (that’s 4 functions already).

What we need is something like a C++ conversion operator.

And we have one!!!!!!!

supertype string (x:rope) => …

This says, a rope is a subtype of a string, so a rope can be passed to a function
with a string parameter. When that happens, the function above is used as a coercion.

If this works it’s magical, because it makes the use of ropes transparent!

The main problem is that this is ambiguous:

fun f(x:string) => …
fun f(x:rope) => …

because Felix does not weight subtyping coercions worse than no coercions.

At least I think this is what happens … hmm …



John Skaller
ska...@internode.on.net





John Skaller2

Oct 27, 2018, 3:50:11 PM
to felix google
OK so finally this works:

/////////
noinline fun sx3(x:string,y:string,z:string) {
  var a = "Sx3(" + x;
  a += y;
  a += z + ")";
  return a;
}

reduce s3(x:string, y:string, z:string): x + y + z => sx3(x,y,z);
proc B() {
  var p = "p ";
  var b = sx3(p,p,p);
  println$ b;
}

noinline proc A() {
  var q = "hello ";
  var chaos = "of chaos";
  val r = q + "World ";
  var s = r + chaos;
  println$ s;
}

A();
B();
///////////////

So, if I comment out the B() stuff, the reduction isn’t applied.
This is because Felix is conservative with reductions.

After monomorphisation and other operations, it strips the reductions down
to ones where every symbol on BOTH the left and right hand side of the reduction
exists.

If I take out B, sx3 is no longer directly called and is removed.
Then the s3 reduction, if applied, would fail. Of course I could keep all
the symbols on the RHS of every reduction until they’re applied for the
last time, then delete the reductions, and do another symbol GC pass.
But I want to be careful because reductions are EXPENSIVE.

The POINT however, is that the s3 reduction IS applied in A() as is seen
by the output:

Sx3(hello World of chaos)
Sx3(p p p )

It survives monomorphisation because it is a monomorphic reduction.
It cannot be applied in the polymorphic reduction phase because there is no
pattern

x + y + z

in A. However after monomorphisation and val folding the val q is replaced in the definition
of r and the result replaced in the definition of s, so s reads:

var s = “hello “ + “World “ + “of chaos”;

and now the pattern is present. It was important to make that work because most people
are not going to write reducible expressions; most of them will arise only after substitutions
as above, including inlining.

[Although in this case the s3 reduction applied 143 times to the library, which would have
made a complete mess if it had been used, since the result is a cheat: s3 doesn’t
actually just concatenate 3 strings]

Also, val folding currently ONLY works at --usage=production or --usage=hyperlight,
so the reduction won’t work at the --usage=prototype or --usage=debug optimisation levels.
[I should turn reductions off at lower optimisation levels perhaps]

What I actually wanted to experiment with was ropes!

The idea is that strings and ropes get added together to form ropes,
which convert back to strings for any operation other than addition
(concatenation).

Actually some operations would work fine on ropes.
For example, saving a rope to a file, you don’t need to join the
pieces into a string first. Iterators and maps also don’t require a string.
However any kind of indexing operation, search, etc etc probably does.

So the idea is that rope is a subtype of a string, and a coercion is provided.
The hard bit is something like:

fun + (x:rope, y:string) =>…
fun + (x:rope, y:rope) => …

Obviously, we want to use the second overload if possible. I think this works because
it is more specialised. The real nasty is this one:

fun + (x:string, y:string) => …

which is the original operation. Another issue is: coercions are introduced during binding.
But reductions are done after that .. :-)



John Skaller
ska...@internode.on.net





John Skaller2

Oct 27, 2018, 11:48:14 PM
to felix google
So below is a crude rope implementation.
And here are the performance results:

~/felix>flx --usage=hyperlight rope
Hyperlight optimisation on
String length = 40960
Len = 7255941120
Strings: Elapsed = 13.3477s
Rope concat: Elapsed = 5.71422s
Len = 7255941120
Ropes to string: Elapsed = 65.757s

The concatenation performance for ropes is surprising.
Not that it’s faster than the native C++ string concatenation ..
the fact it is so horribly SLOW.

If I take the “rev” out of the render routine it speeds up to 43 seconds.

What’s going on? The answer is COPYING. The C++ code can use
rvalue binders to avoid most copying. A native (written in C++) implementation
might even be faster.

Now, when a list is reversed, the data value of each node is copied
from one list to the next. There is an rvalue reverse that doesn’t even allocate
any new nodes, but removing the reverse is even faster. Then if you look
at the append, it’s a list concatenation, which internally has to reverse the
first list and then splice that onto the second list. So copying .. even though
the first list could be an “rvalue”. In fact internally the reversed list is known
to be an rvalue, which is how it can do a destructive splice safely. It knows
the reversed list is unique because it created it .. but it doesn’t use the rvalue
reverse (because it doesn’t know it’s an rvalue).

Internally, copying the data from one node to another doesn’t do a move
but a copy. Even though lists are immutable! So the strings are too.
But the routines don’t know that.

I’m going to try using pointers to heap allocated strings.
However I suspect that

(*ps).len

is going to make a copy of the object on the list, just to obtain its length.
This is because Felix has no references. For product types, overloaded projections
handle pointers to products efficiently. But methods on primitives only work with
pointers to them if you define the overloads yourself.

//////////////////////////
Implementation
/////////////////////////////
// define a rope as a list of strings in reverse order
struct rope {
  r: list[string];
}

// constructor from one string
ctor rope(s:string) => s.list[string].rope;

// appending a string just prepends it to the list
fun + (r:rope,s:string) => rope (s!r.r);

// appending a rope is just list concatenation
fun + (r:rope,s:rope) => rope (s.r + r.r);


// the length of a rope is the sum of the lengths of its pieces
fun len (r:rope): size =>
  fold_left (fun (acc:size) (s:string) => acc + s.len) 0uz r.r
;

fun render (r:rope): string {
  var out = "";
  reserve (&out, r.len + 1);
  iter (fun (s:string) { out += s; }) (rev r.r);
  return out;
}

proc test () {
  var s = "Hello";
  for i in 0..12 perform s += s;
  println$ "String length = " + s.len.str;
  begin
    var t = #time;
    var k = s;
    for i in 0 .. 10 perform k = k + k + k;
    println$ "Len = " + k.len.str;
    println$ "Strings: Elapsed = " + (#time - t).str + "s";
  end

  begin
    var t = #time;
    var k = rope s;
    for i in 0 .. 10 perform k = k + k + k;
    println$ "Rope concat: Elapsed = " + (#time - t).str + "s";
    #collect;
    t = #time;
    var q = render k;
    println$ "Len = " + q.len.str;
    println$ "Ropes to string: Elapsed = " + (#time - t).str + "s";
  end
}

test;


John Skaller
ska...@internode.on.net





John Skaller2

Oct 28, 2018, 2:17:37 AM
to felix google


> On 28 Oct 2018, at 14:48, John Skaller2 <ska...@internode.on.net> wrote:
>
> So below is a crud rope implementation.
> And here are the performance results:
>
> ~/felix>flx --usage=hyperlight rope
> Hyperlight optimisation on
> String length = 40960
> Len = 7255941120
> Strings: Elapsed = 13.3477s
> Rope concat: Elapsed = 5.71422s
> Len = 7255941120
> Ropes to string: Elapsed = 65.757s


Yes. Here is the implementation using pointers to heap allocated copies of the string:

~/felix>flx --usage=hyperlight rope
Hyperlight optimisation on
String length = 40960
Len = 7255941120
Strings: Elapsed = 12.192s
Rope concat: Elapsed = 0.56186s
Len = 7255941120
Ropes to string: Elapsed = 17.5594s

Now the rope concatenation utterly creams the string concatenation as it should.
However the rendering routine is still too slow.

I added this:

Rope len = 7255941120
Rope len time: Elapsed = 0.461086s

So it takes a long time just to add a few integers.



John Skaller
ska...@internode.on.net





John Skaller2

Oct 28, 2018, 5:40:37 AM
to felix google
So here is an idea. The list rev function has a special property: it ALWAYS returns
a unique list. Not all list functions do that; for example cons doesn’t: the tail could be
shared.

So the idea is this, roughly:

fun urev (x:list) => box( rev x);
fun urev (x: uniq list) => box (inplace_rev x);

The only hassle is we need urev as well as rev because if you write:

var u = urev x;
var a = u;
var b = u; // ERROR Uniqueness violation

You have to do

var u = unbox (urev x);

to discard the uniqueness explicitly. The advantage though is you can do:

fold_left (fun (acc:A) (elt:T) => .. _) init (urev x)

and it should work because uniq T is a subtype of T. Unfortunately, or not,
Felix ONLY allows the decay of uniq T to T when passing values to function parameters,
NOT in local variable initialisation; there you have to explicitly do an unbox to
discard it. Otherwise it would depend on the type of the variable:

var x : list = urev a; // x is type list
var x = urev a; // x is type uniq list

which might be a bit surprising. This can’t happen with parameters usually,
because the type has to be specified (except for generics ..)

If you write:

urev (urev x)

then the list is reversed in place twice.

I think Felix uniq types are more powerful than C++ rvalue binders BUT
they’re problematic. Temporaries (what ever that means) are always uniq.
Of course I’m not sure if that applies to data structures (like lists) or just
the head node. It’s confused by the fact algebraic lists are (supposedly) immutable.
Uniq is mainly for mutation.

I don’t like uniq. The algebra doesn’t feel right.



John Skaller
ska...@internode.on.net





John Skaller2

Oct 28, 2018, 12:36:43 PM
to felix google
LOL!

I did some hackery to ensure that Felix cannot be copying any strings.

I timed the Felix iter function without the string copy .. 2 seconds: BAD.
It’s 171K iterations though.

I rewrote the loop to get rid of the iter HOF:

fun render (r:rope): string {
  var out = "";
  reserve (&out, r.len + 1uz);
  var x = rev r.r;
next:>
  match x with
  | Empty => return out;
  | head ! tail =>
    x = tail;
    out += head;
    goto next;
  endmatch;
}

Still no go. Around 16 seconds and that’s JUST for the rendering to a string.
But,

IT SHOULD BE FASTER THAN C++.

Why? Because, I reserved the full length of the final string
so the string array is NEVER copied during the rendering.
(Although the “return” might be copying it .. … )

But then I thought ..

HEY .. my MACBOOK PRO HAS A FLASH DISK

Now, you have to remember I AM AN OLD PROGRAMMER.
I come from the days of Tape Drives and Floppy Disks.
I come from the days when a transistor radio next to the computer
was a debugging tool, picking up RF from the CPU.

These newfangled flash disks DO NOT MAKE ANY NOISE.

You getting me? Nope, you’re not old enough!!!!

So .. I swapped the order of the tests:

~/felix>flx --usage=hyperlight rope
Hyperlight optimisation on
String length = 40960
Rope concat: Elapsed = 0.559681s
Rope count = 177147
Rope count time: Elapsed = 0.464256s
Rope len = 7255941120
Rope len time: Elapsed = 0.46308s
Len = 7255941120
Ropes to string: Elapsed = 11.153s <======
Len = 7255941120
Strings: Elapsed = 27.234s <=========


Guess who’s faster now :-)

And now I returned a pointer to the final string in the ropes:

Ropes to string: Elapsed = 5.76164s
Ropes total: Elapsed = 7.25415s

So indeed, the function is copying the dang string.
So in this order, the ropes are 4 times faster than C++.

The reason .. VM. The final string is 7,255,941,120 bytes,
or about 7Gbyte. Half my computer’s RAM. So, the first
string isn’t deallocated and the second one is pushing
paging on my silent disk heavily. I just can’t hear the disk
rattling!

The funny thing is .. now I can’t get it to reproduce.
Returning a pointer to a string:

String length = 40960
Len = 7255941120
Strings: Elapsed = 12.3468s
Len = 7255941120
Ropes total: Elapsed = 6.54693s

and with the ropes running first:

String length = 40960
Len = 7255941120
Ropes total: Elapsed = 6.49097s
Len = 7255941120
Strings: Elapsed = 27.9665s

Either which way .. ropes are faster!
At least twice as fast.

If C++ doubles the length of the reserved store when it runs out,
this is AT MOST twice the amount of copying. Which roughly
agrees with the test result.

The collector runs, it says this:

Deleting collector total time = 35.04742 seconds, gc time = 1.25201 = 3.57%
It doesn’t run in the C++ version of the test.
Swapping the order:

Deleting collector total time = 18.68382 seconds, gc time = 1.27932 = 6.85%



John Skaller
ska...@internode.on.net





John Skaller2

Nov 2, 2018, 10:00:30 PM
to felix google
I’ve been thinking about how to make something like constexpr work.

I’m adding a new category of tests, unit tests.
I tried this:

typedef isd_t = int * string * double;
var x : isd_t = 1,"Hello",42.1;
var p0 = proj `0:3 of isd_t;



My first unit test failed. The grammar only allows a literal integer index for
tuple projections. Supporting the true index type is slightly tricky, because
Felix would have to check that the type is correct. The current AST terms
don’t allow that. Throwing out the type 3 can’t be allowed; it has to be checked.

But is it worthwhile?

My answer is: for literals only NO. On the other hand for a constexpr YES.
For example

constexpr val index = `0:3;
.. proj index of isd_t …

This is useful because the same “index” could index several things.
Maybe you could even have constexpr loops :-)

In Felix, as in C++, only some types would admit const exprs.
Literal strings, integers, and bools already get folded, however these
expressions don’t allow constexpr named values.

Currently “val” is almost a constexpr. However, for this to work, val substitution
would have to be done BEFORE binding.

On the other hand, macro substitution (and desugaring) currently handle folding.
And macros DO provide the feature we want like:

macro val index = `0:3;

except, it isn’t type checked.

Type checking is TRICKY in the presence of type variables!!!!
More precisely, if you have

`0:T + `1:T

assuming we could add unit sums, we still have to ensure T:UNITSUM.
If the addition were modular, we could not do the calculation until AFTER monomorphisation.

Any which way .. the grammar won’t even allow a macro name for a tuple projection.



John Skaller
ska...@internode.on.net





John Skaller2

Nov 4, 2018, 6:29:35 PM
to felix google
In order to check the compiler is doing what is expected, I’m adding a bunch
of unit tests to ensure correctness. By being systematic I’m hoping to provide
coverage. A number of omissions and errors have already been corrected.
There is a compiler refactoring in sight which could make a serious mess,
and these tests will help ensure the refactoring is correct.

The tests are also useful for tutorial purposes since they supposedly cover
all the cases with minimal examples.

The first batch of tests deals exclusively with projections.
The first one I wrote looks like this:

@title Stand Alone Tuple Value Integer (Constant) Projection Tuple domain
@felix
typedef d_t = int * string * double;
var x : d_t = 1,"Hello",42.1;
var p0 = proj 0 of d_t;
var p1 = proj 1 of d_t;
var p2 = proj 2 of d_t;
println$ x . p0;
println$ x . p1;
println$ x . p2;
@expect
1
Hello
42.1
@

This test deals with the CORE case of a stand alone tuple value projection.

Stand-alone means the projection is constructed as a first class function
before being applied. The argument of the projection operator “proj” here
must be a simple int literal. Note, typed integer literals are not allowed.
The argument of “proj” must be a constant because the codomain
of the projection varies with its domain value.

Then we have further tests of a similar nature for

* compact linear tuples
* arrays
* arrays with compact linear base

Also we have to check

* structs
* records

because they’re products with projections too, they just use the field name
as the projection instead of an integer.

Also singletons and unit tuples need to be handled right and checked!

Next, we have to check array projections:

@title Stand Alone Precise Array value projection
@felix
typedef a_t = int^3;
var x : a_t = 1,2,3;
for i in ..[3] do
var p = aproj i of a_t;
println$ x.p;
done
@expect
1
2
3
@

Unlike tuple projections, array projections accept a *variable* index.
We use “aproj” for array projections. They only apply to arrays, because
only arrays have the same codomain for all indices. Note that at
present, the index expression must be of the precisely correct type;
an integer will not do. In the above example the type is 3.

It’s a bit inconsistent, since tuple projections require an integer literal,
and won’t accept the precisely correct type!

So far, we’ve only dealt with value projections. Recall all projections
in Felix ALSO work for pointers. This is the secret power algebra that
allows Felix to get rid of references and lvalues.

So we have to repeat all the above tests, but this time using pointers
instead of values. And it’s not that simple, because there are THREE
modes of pointers:

* read/write pointers &x
* read only pointers &<x (called pointer to const in C++)
* write only pointers &>x (not available in C++)

I am thinking of adding read-modify-write pointers &<>x as well.
This works like a read/write pointer EXCEPT that it allows atomic
read/modify/write operations.

In addition, for compact linear types there are TWO kinds of pointers!!
The pointer to the top level type is an ordinary machine pointer.

However a projection of that pointer to a compact linear components
is a special kind of pointer, a compact linear pointer. It uses 3 machine
words. The first is the machine pointer to the outer compact linear object,
and then there is a divisor and modulus for extracting the value.
So we need an additional (batch of) tests in which the machine pointer
is projected to a compact linear pointer which is then projected again
to another compact linear pointer, to obtain coverage.
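
An OCaml sketch of the three-word pointer and of projecting it a second time; the
names are mine, and it assumes components nest so that projecting simply multiplies
the divisor:

type clptr = { p : int ref; divisor : int; modulus : int }

let get { p; divisor; modulus } = !p / divisor mod modulus

(* project a component given its divisor/modulus relative to the
   component this pointer already addresses *)
let project ptr ~divisor ~modulus =
  { ptr with divisor = ptr.divisor * divisor; modulus }

let () =
  (* one machine word holding a value of type 2 * 3 * 5: the tuple (1,2,4) *)
  let w = ref (((1 * 3) + 2) * 5 + 4) in
  let whole = { p = w; divisor = 1; modulus = 2 * 3 * 5 } in
  let tail = project whole ~divisor:1 ~modulus:15 in   (* the 3 * 5 part *)
  assert (get (project tail ~divisor:5 ~modulus:3) = 2)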

And of course compact linear pointers have read/write attributes too.
Don’t forget, we need both proj and aproj checks!

Are we done?

Not by a LONG SHOT!

Next, we have inline projections:

@title Inline Tuple Value Integer (Constant) Projection Compact Linear Array domain
@felix
typedef d_t = 5 ^ 3;
var x : d_t = `1:5,`2:5,`4:5;
println$ x . 0 . _strr;
println$ x . 1 . _strr;
println$ x . 2 . _strr;
@expect
case 1 of 5
case 2 of 5
case 4 of 5
@

With inline projections, they’re a direct application of the index type to the
product. And, we usually allow int as well as the precise type. And,
for arrays, it can be an expression. And they have to work with pointers too.

Done yet?

NOPE!

There are two further major extensions.

First, slices. A slice is a *range* of indices.
So all the stuff above has to be repeated for slices.
A slice is used to extract a subarray, which is a contiguous sequence
of components. Its just the concatenation of the results of a sequence
of projections.

And now for the BIG one! Arrays support a special isomorphism
of the index types which gives higher rank arrays. In particular
consider

(int ^ 3) ^ 2 :>> int ^ ( 2 * 3)

This isomorphism changes an array of length 2 of arrays of length 3
into a matrix, with dimensions 2 * 3. The difference is the matrix
takes a SINGLE index of type 2 * 3, that is, the index is a tuple.

Felix also allows you coerce that to

int ^ 6

which is a linear array again. In particular, it allows you to write
some rank independent operators. For example you can add two
matrices by linearising them, adding the linear arrays, and then
coercing back to a matrix.

Now the point is, the matrix has a projection with domain 2 * 3.
So we need to support (AND THIS IS NOT DONE YET):

proj (`1:2, `1:3) of int ^ (2*3)

and of course the aproj equivalent.

And we’re STILL not done. The above is of course recursively applicable.

BUT NOW WE NEED TO HANDLE SUM TYPES.

Sums have injections instead of projections. We have three core coproducts:

* sums (dual of tuples)
* variants (dual of structs)
* polymorphic variants (dual of records**)

and of course we have

* repeated sums (dual of arrays)

** Note, polymorphic variants are not quite the dual of records
because records allow scoped labels, whilst polymorphic variants
do not.

Also note, Felix has polyrecords, which are records with
“row polymorphism”. There should be a corresponding
“column polymorphism” for polymorphic variants but isn’t.

THIS IS A LOT OF TESTS!

Why do we need all this? Well consider:

((1,2,3),(4,5)): int^3 * int ^2

is isomorphic to

(1,2,3) + (4,5) // notation??

in other words

int ^ (3 + 2)

In other words, we first select the left or right array,
and then we have projections with domain 3 and 2 respectively.
We can linearise that to int ^ 5 of course.

And note, 3 + 2 is a sum type. So this is an alternate view of the
original tuple of tuples which, as for the matrix case, allows
a SINGLE index type.

What’s IMPORTANT in this calculus is to observe we can LINEARISE
the base without losing structure, by moving the structure INTO
the index.

Why is that important? Well it’s a FUNDAMENTAL RULE OF CATEGORY THEORY
that functors preserve structure. We can get POLYADIC array handling (and more),
by transforming ANY product composite into a SINGLE product with
a structured index. And then, any functorial operations on the structure
can be performed by:

(a) transform composite structure to index
(b) flatten index
(c) perform operation on linear array
(d) unflatten index
(e) transform back to original structure
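
A minimal OCaml sketch of steps (a)-(e) for one concrete case, adding two 2 x 3
matrices by flattening, operating linearly, and unflattening:

let flatten (m : int array array) = Array.concat (Array.to_list m)

let unflatten ~rows ~cols a =
  Array.init rows (fun i -> Array.sub a (i * cols) cols)

let madd a b =
  let fa = flatten a and fb = flatten b in
  let sum = Array.init (Array.length fa) (fun i -> fa.(i) + fb.(i)) in
  unflatten ~rows:(Array.length a) ~cols:(Array.length a.(0)) sum

let () =
  let a = [| [|1;2;3|]; [|4;5;6|] |] in
  assert (madd a a = [| [|2;4;6|]; [|8;10;12|] |])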


The ultimate power of this is enormous. It’s sometimes called RESHAPING.
The critical thing is:

(a) static type checking the reshaping operations
(b) OPTIMISING THE CALCULATIONS


All the calculations can be done exclusively at run time.
Python can do this with a well known extension.
But Felix can do compile time checks on the types,
and it can do the moral equivalent of constant folding
when constants are involved.

Constant folding like this has little use in quantum mechanical calculations
using matrices of double precision floats. But it is MANDATORY for cryptographic
work, signal processing, and other jobs involving heavily structured small scale
data.



John Skaller
ska...@internode.on.net





John Skaller2

Nov 5, 2018, 1:50:33 PM
to felix google
Ok, so I have a bit of a conflict here.
BTW: I added

var f = (ident of int);
println$ f 1;

i.e. an explicit identity function.


Given a record with duplicate fields,

(a=1,a=2) . a

returns the first field. But we need an operator that returns ALL of them.

The compiler actually has:

| BEXPR_prj of int * Flx_btype.t * Flx_btype.t
| BEXPR_rprj of string * int * Flx_btype.t * Flx_btype.t
| BEXPR_aprj of t * Flx_btype.t * Flx_btype.t

(* union/sum constructor wrapped as function *)
| BEXPR_inj of int * Flx_btype.t * Flx_btype.t
(* first arg = constructor index, second arg domain = ctor type, third arg codomain = union type *)

(* generalised injection, dual to array projection. *)
| BEXPR_ainj of t * Flx_btype.t * Flx_btype.t


The BEXPR_rprj is a record projection, but it takes TWO arguments,
the first argument is the field name as a string, and the second is the
sequence number. So in the example above

BEXPR_rprj (“a”,1,record_type, int)

where record_type is the record type, then the expression is the projection
of the SECOND “a” and this applied to the record returns 2.

BUT there is no syntax for this. The compiler has constructor functions:

let bexpr_rnprj name seq d c =
let bexpr_rprj name d c = bexpr_rnprj name 0 d c

so actually “rprj” just calls “rnprj” with the sequence number set to 0,
this gets the first field with the given name. This is what the user gets
when they use the name “a” as a projection.

There’s no syntax for rnprj.

HOWEVER I have a another idea, which works differently:

(a=1, a=42.7, c=“hello”) @ a —> (1,42.7)

i.e. a projection which returns the whole tuple. This is a bit tricky with
polyrecords but let’s leave that for the moment.

Now the thing is, this operator cannot fail:

var x = (a=1,a=42.7,c=“Hello”, 1,2,3);

x @ a —> (1,42.7)
x @ c —> “Hello”
x @ fred —> ()

and


x @ a . 0 —> 1
x @ a . 1 —> 42.7

In other words, @ field returns a tuple of all the fields with the given name,
which means () if the record doesn’t have that name, and then you can
apply the usual tuple projection to get the field you want. Note this ALSO means

(a=1, b=()) == (a=1)

that is, fields with value unit tuple are ignored because @ returns the same
value for both.
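
A toy OCaml model of @, treating a record as an association list (with homogeneous
values, which OCaml forces on us):

let at record name =
  List.filter_map (fun (k, v) -> if k = name then Some v else None) record

let () =
  let x = [ ("a", 1); ("a", 42); ("c", 7) ] in
  assert (at x "a" = [1; 42]);          (* all the “a” fields, in order *)
  assert (at x "fred" = []);            (* missing field: the empty tuple *)
  assert (List.nth (at x "a") 1 = 42)   (* x @ a . 1 *)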

I mentioned I added ident of T, for the identity function. This is important
because it is the projection of a 1 element tuple!

Its not the same as

proj 0 of T

because T might be a tuple! It IS the same as the compiler term

EXPR_proj (0,T,T)

even if T is a tuple, but the user syntax only specifies the domain of the
projection function; the codomain is implied by selecting the 0’th component
of the domain (which must be a tuple!) The notation makes sense if
T isn’t a tuple, but if it IS a tuple it would be ambiguous (did you mean
the identity function or the projection of the first component?)

That leaves the question: what is the projection of ()??

Since there are ZERO components… there are no projections.
Although .. the identity function is well defined.

That leads to an interesting issue which has an even more interesting
resolution!

() is a compact linear type. Compact linear values inside a composite
are extracted by the formula

x / divisor % modulus

For example the lowest bit is given by

divisor = 1, modulus = 2
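
A quick OCaml check of the formula on a compact linear 2 * 3, packed as
x2 * 3 + x3 (the layout is my assumption):

let get value ~divisor ~modulus = value / divisor mod modulus

let () =
  let v = 1 * 3 + 2 in                       (* the tuple (1, 2) : 2 * 3 *)
  assert (get v ~divisor:3 ~modulus:2 = 1);  (* first component *)
  assert (get v ~divisor:1 ~modulus:3 = 2)   (* second component *)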

For the compact linear type (), no information is required. It uses no space.
The divisor is undefined, it can be anything, but the modulus is 0.
That is, if you actually applied it to a compact linear type you’d be dividing
by 0!


SO the representation of

2 * 1 * 3 == 2 * 3

i.e. the unit type in the tuple can be thrown out. The problem is the index numbers.
If index 0 is the 2, index 1 is the 1, and index 2 is the 3, then we have a problem if we try
to extract index 1. Because there ain’t no value there!


The formula MUST BE RIGHT. It’s intuitive. This means the boundary conditions
(the representation of unit) must drop out. The only way to do this is to DELETE
the unit. Which means THERE IS NO INDEX 1. More correctly, index 1 is the 3,
there is no index 2, and there are only TWO components in the tuple. Not 3.
The unit is elided.

Felix already takes ANY function returning () and deletes its application,
replacing it with unit value. But what is that? A 64 bit unsigned integer with
NO values set into it? (Felix actually uses zero, but that’s not correct).

A pointer to a () is universally NULL. Which means you cannot dereference it.
But that’s COOL because dereferencing is a function, and it returns unit ..
SO IT IS ELIDED.


Note also, functions accepting a sole unit never have any parameters of type
unit. The unit is never passed. The function would have C signature

T f ()

So even though there is a value () of type unit, it is not representable!

Getting rid of unit values is not so easy. The PROBLEM is that if you have
a type

T * U

where T, U are type variables, you expect there are two components meaning
two projections. If T becomes () during monomorphisation and is elided,
the projection indices will be all messed up! I would have to adjust them
when the unit is deleted. It’s all a bit tricky when the unit is a parameter.
Felix does delete variables of type unit and functions returning unit
INCLUDING GENERATORS!!!

Note that unlike void, variables of type unit are just uninitialised.
They can still exist. It’s just that we can safely throw them out,
provided we replace all such variables in expressions with the value
unit.

I am not digressing! The question of the @ operator on records
is intimately tied up with this. The cool thing about the model is that
we can uniformly handle tuples of 1 value by using the identity
as the projection. In other words

x @ c . proj 0 = x @ c . ident of double = x @ c
= x.c

AND NOW THE KILLER:

x @ d = ()
x @ d . 0 = ERROR
x . d = ERROR

There is no d field in x.

the point is I’m thinking of just using

x . field

to return ALL the fields, instead of the first.
If there are two, and you expected 1, you’ll get an error.
If the field doesn’t exist, you get an error too, but not on
extracting the fields, that gives (), but when you try to select
the sole component with 0, since there is no component.


The main hassle is in polyrecords

r = (a=1 | w:T)

where T is not known, the

r @ a

is not well defined. In fact the reduction is:

1 + ( w @ a)

where + means “tuple concatenation”.



John Skaller
ska...@internode.on.net





John Skaller2

Nov 7, 2018, 7:12:06 PM
to felix google
Not sure why this didn’t fail before ..

The Felix compiler currently “knows” a bit about slices. It’s one of those messes where
it was first defined in the library, then I added compiler support for performance.

This test code is failing with an error at C++ compile time:

var x : int ^ 100;
for i in ..[100] perform x&.i <- caseno i;
var s : int ^ 99= x . (1..2000);
println$ s.0, s.2, s.8;

Felix thinks the array is compact linear. But it isn’t!
In this case, the compiler is delegating to the library routine “subarray”.

The compiler cannot handle large slices of arrays at the moment.
Small slices are constructed by expansion, that is, with an explicit tuple:

(a.p0, a.p1, a.p2, ….)

and we don’t want the number of terms to be too large. Arrays can be huge,
we don’t want to blow the compiler out of the water with terms of billions
of elements.

So cleverly we call this:

fun subarray[
  FIRST:UNITSUM,
  LEN:UNITSUM,
  T,
  N:UNITSUM,
  K:UNITSUM=_unitsum_min(LEN, N `- FIRST)
]
  (a:T^N) : T ^ K
=
{
  var o : T ^ K;
  for i in ..[K] do
    var first = Typing::arrayindexcount[FIRST].int;
    var outix = caseno i;
    var inpix = (first + outix) :>> N; // checked at run time?
    &o.i <- a.inpix;
  done
  return o;
}

The existence of this routine required years of work to support the constraint handling.
But there’s a bug!

The problem is, we don’t know if T is compact linear or not. And it matters!
The expression:

&o

is always a machine pointer. But the projection generated by compact linear index i used here:

&o . i

depends on whether the array o is compact linear or not. If it isn’t compact linear,
the projection is just the C addition of an integer; if it is compact linear, it’s a complicated
mess in which we make a pointer consisting of a machine pointer to the compact linear value,
a divisor, and a modulus. The pointer has three machine words in it.

Which code is generated depends on the TYPE OF THE PROJECTION. In this case
the domain is the same for both kinds of projection. But the codomain depends on
the base type of the array. For the non-compact linear case the function type of the
projection is

&(T^N) -> &T

which goes from a machine pointer to the array, to a machine pointer to an element.
whereas in the compact linear case it is:

&(T^N) -> _pclt< T^N, T >

The RHS there says the element type is T, but it is stored in a compact linear array
of type T^N. Note, this is a weirdo type, but you can write it; it is parsed. Values of this
Felix type have C++ type:

struct RTL_EXTERN clptr_t
{
  cl_t *p;
  cl_t divisor;
  cl_t modulus;
  clptr_t () : p(0), divisor(1), modulus(-1) {}
  clptr_t (cl_t *_p, cl_t d, cl_t m) : p(_p), divisor(d), modulus(m) {}

  // upgrade from ordinary pointer
  clptr_t (cl_t *_p, cl_t siz) : p(_p), divisor(1), modulus(siz) {}
};

whereas a non-compact linear pointer has type

T*

So why is there a bug? Well it’s here in the compiler:

let rec islinear_type bsym_table t =
  match t with
  | BTYP_void
  | BTYP_unitsum _ -> true
  | BTYP_type_var _ -> true (* THIS IS NEW AND WILL PROBABLY BREAK STUFF *)
  | BTYP_tuple ts
  | BTYP_sum ts -> List.fold_left (fun acc t -> acc && islinear_type bsym_table t) true ts
  | BTYP_rptsum (count,base) -> islinear_type bsym_table base (* coarray *)
  | BTYP_array (base,index) -> islinear_type bsym_table base
  | _ -> false

See the note???

The compiler assumes a type variable is compact linear if it is asked if a type is
compact linear.

In the test case .. this assumption is WRONG. So the compiler is typing the expression

&o

incorrectly.

Note: you may think, well, why not just wait until we know T?

#$%^&*( THAT IS C++ GARBAGE !@$%^&*()

FELIX TYPES POLYMORPHIC ENTITIES. THE TYPE HAS TO BE KNOWN

****** BEFORE MONOMORPHISATION *******

What’s the solution??

Logic DICTATES it. We HAVE TO KNOW.
Let’s have a look at the representation of type variables:

| BTYP_type_var of bid_t * kind

The first argument, bid_t is just an integer identifier for the variable.
The second argument, kind, is .. well .. the kind or meta type of the type variable.

What are the meta types??

type kind =
| KIND_type
| KIND_unitsum
| KIND_compactlinear
| KIND_bool
| KIND_nat
| KIND_tuple of kind list
| KIND_function of kind * kind

You can see, there is a kind for the compact linear case. It is a subkind of the kind
for types. Also, since unitsums are compact linear, they’re subkinds of compactlinear
and also type. So what we need is this:

fun subarray[
  FIRST:UNITSUM,
  LEN:UNITSUM,
  T:COMPACTLINEAR,
  N:UNITSUM,
  K:UNITSUM=_unitsum_min(LEN, N `- FIRST)
]
  (a:T^N) : T ^ K


fun subarray[
  FIRST:UNITSUM,
  LEN:UNITSUM,
  T:TYPE,
  N:UNITSUM,
  K:UNITSUM=_unitsum_min(LEN, N `- FIRST)
]
  (a:T^N) : T ^ K


That is, we need TWO routines, with the same bodies, but different kinding
of the type variable T. The more special case, the first one, must be selected
by overload resolution if the type argument T has kind COMPACTLINEAR
because it is more specialised.

If we had KIND VARIABLES we might get away with only one definition,
but we don’t have them at the moment!

UNFORTUNATELY you also cannot overload on kinds!

Technically, we want “kind classes” like “type classes” because that
defers the choice until after monomorphisation, but that requires
kind variables.

The solution is to name the two subarray cases differently.
UNFORTUNATELY the non-compact linear case has to have
the constraint

is a type but NOT compact linear

and there’s no way to say that! There’s no such kind, and, constraints
only apply to types, not kinds.

BTW: if you’re confused by the terminology .. it’s just what C++ calls concepts.
In Felix type constraints can be expressed with type classes.
However type classes cannot generalise over structural typing,
that’s what kinds are for. Type classes really only work for nominal typing.





John Skaller
ska...@internode.on.net





John Skaller2

Nov 8, 2018, 1:31:53 AM
to felix google
OMG … I’m an IDIOT!

The solution is obvious now. And BEAUTIFUL. Which means it’s right!

The idea that a normal pointer calculation depends on the base type
being NOT compact linear is clearly CRAP.

The CORRECT idea is that ordinary pointers are SUBTYPES of compact linear pointers.

In other words, for a variable of T:TYPE, the address is a plain pointer &T.
But a *projection* always yields a compact linear pointer.

If, after monomorphisation, we find for example that the divisor and modulus
are 1, then we can optimise the compact linear pointer to a machine pointer.

In fact, what we are saying is more general. Given a pointer to a variable
on the heap, when we find a product component, the BASE pointer
to the whole object should not be lost. It is actually a segment address.

An interior pointer is just the base pointer plus the divisor and modulus.
If we represent that at run time, array bounds checks are always possible.

The main issue is, for large arrays .. well we really aren’t going to divide
by billion digit numbers…

UNLESS … they’re powers of 256. I have of course said this before.
Dividing the value 256 is equivalent to subtracting 1 from the address.

So to be pragmatic a pointer has to have

(a) a base address (which is not mutable)
(b) an offset in bytes
(c) a divisor
(d) a modulus

The offset in bytes N is of course just a divisor

D = 256 ^ N

The tricky bit is that the divisor (c) has to be capped,
which requires introducing padding. In other words
a really big value cannot be compact.

We can dispense with (b) if we make (a) mutable.

The type involved, however, is already correct: the Felix
compact linear pointer representation is enough for all cases.

The only trick is normalising it. In other words, if we have
a base type which is a product of products of products of products,
and the whole thing is compact linear in theory, in practice,
we have instead to make the outer projections ordinary
pointer + offset operations, and move to compact linear mode
only when a product value fits in a single machine word.

Its a nice theory but there’s a problem: there’s no way to
calculate the size of a large value. Even with “big integers”.
I mean, the calculation can be done .. but it will take hours
and use heaps of memory.

Basically we need to calculate the maxium value (plus 1),
round it a bit, and divide by 2^ 256 to get the number of bytes.
So we actually need a fast way to avoid doing large multiplies
and divides.




John Skaller
ska...@internode.on.net





John Skaller2

Nov 8, 2018, 2:42:59 PM
to felix google


> On 8 Nov 2018, at 11:12, John Skaller2 <ska...@internode.on.net> wrote:
>
> let rec islinear_type bsym_table t =
> match t with
> | BTYP_void
> | BTYP_unitsum _ -> true
> | BTYP_type_var _ -> true (* THIS IS NEW AND WILL PROBABLY BREAK STUFF *)
> | BTYP_tuple ts
> | BTYP_sum ts -> List.fold_left (fun acc t -> acc && islinear_type bsym_table t) true ts
> | BTYP_rptsum (count,base) -> islinear_type bsym_table base (* coarray *)
> | BTYP_array (base,index) -> islinear_type bsym_table base
> | _ -> false

Actually it’s probably correct.

Hmmm. The thing is, you cannot take a projection of a type variable or pointer to one
because you don’t know it’s a product. However in

T * 2

you have to assume the type is compact linear or you get the wrong kind of pointer.

The problem is: let’s degrade

_pclt < T * 2, 2> with T set to int

to

&2

because after substitution the type isn’t compact linear. This one is correct.
Unfortunately, there isn’t enough information to do the degrade in general.

Suppose you have a product at level 0 of some components, say one of them
at level 1 is also a product, and again at levels 2 up to n-1.

At n-1, we also have a product but it’s now compact linear, so a product at
level n is compact linear, and likewise at n+1, n+2, through to n+m-1.

When you take a pointer at the outer level and project you get an ordinary
pointer. And again. You keep getting pointers until level n. At level n,
you have a pointer to a compact linear store. Now if you project, you get a compact
linear pointer. The base type (domain) is now invariant. Only the codomain changes.
I mean the type the *machine pointer* component points to.

Now all projections to level n+m-1 are compact linear with the same base.

This is what happens now with *nonpolymorphic* types. Exhaustive tests
have been done, but only for one level break (that is, two projections
that go across the type changing boundary at level n-1 to n+1).

The problem is that, if there’s a type variable in there, and we assume
it’s compact linear, then the “machine pointer” is set early. If later
we find T is not compact linear after all, there’s no way to calculate
what the machine pointer should be because we have LOST the projection
path through the product.

For example:

_pclt <2 * ( (2 * T) * (2* (2 * V * (2 * (2 * T))))), T>

Suppose V becomes int, and T becomes 2. The compact linear level break
is then after the V on the right branch of the tree, but it is at level 1 on
the left branch. Since the target is just a T, we don’t know which branch
the projection took. So there’s no way to know what the machine pointer
should be.

The only way to do this is to keep the projection path in the type.
So the argument becomes a list of types!

Then, after monomorphisation, we can track down the list until
we hit a level break.

BUT … there is a POWERFUL BENEFIT to this.

Instead of doing it at monomorphisation .. we wait until code generation
time, and EVEN IF we get a compact linear type, we keep using ordinary
pointers as we go down UNTIL THE SIZE IS SMALL ENOUGH TO FIT
INTO A MACHINE WORD.

For example, if you have an array length 512 of arrays length 32 of bool,
you get an ordinary C like array of length 512, whose components are
compact linear arrays of length 32. Because 32 <= 64 but 512 * 32 > 64.

Obviously in THAT case we can do better: an array of 256 pairs of
32 bit arrays. The pairs are compact linear so use one machine word (64 bits).

So what this all says is: we can have a SINGLE unified pointer type,
which takes a LIST of types representing the projection path.
You write the list from the whole type down to the detail left to right,
but the compiler stores it backwards, so in the compiler a dereference
has the type of the head of the list (in the written form it’s the last thing in the list).

Now we can do the decomposition which determines where the level
break occurs. This is where a projection requires switching from
a machine pointer (one machine word) to a compact linear pointer
(three machine words).

[Note this applies to value projections too! Because a value projection
is, underneath, the dereference of a pointer projection.]


The decomposition can also check sizes and keep going even if
we have a projection of a compact linear type in theory, if, in practice,
the compact representation is too large. “Padding” then gets added
automatically (if the type is strictly less than 64 bits).

Actually we can do even better if we apply some isomorphisms.
For example:

int * 2 * 3

is not compact linear but it is isomorphic to

int * (2 * 3)

which means we COULD store it in two machine words.
[Felix doesn’t know how long an int is ..]

In fact we could store

2 * int * 3

as

int * (2 * 3)

as long as we fiddle the projection indices.

Even better, for a compact linear type we can calculate the size
and round it up to the next of 8, 16, 32, or 64 bits.

The thing about projections is as follows:

(a) tuple projections require constants
(b) all projection chains have a finite depth

There is an IMPORTANT issue here. With arrays we can index
them with variables. Provided we do that exclusively, arrays
obey the same rules as tuples. We cannot track the exact
projection because that would require tracking the index value
which is impossible for arrays (because we’re writing a LOOP!)

However Felix arrays (not varrays!) have a statically known length,
so out-of-bounds accesses are impossible EVEN USING A LOOP.

The KEY point here is that there is no use for incrementable
pointers. You just use a pointer to the whole array and an index.
This is true even down multiple levels, you just use a pointer
and multiple indices to cross into deeper arrays.

In fact, the indexing always reduces to a SINGLE linear value
in theory. In practice you need TWO values, one for the machine
pointer level index, and one for the compact linear pointer index.
So you need two static constants: the size of the compact linear
component, and the size of the level break.
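
A sketch of the two-level indexing; clt_size here is the assumed number of
elements packed into one machine word (the level-break constant):

(* Split a flat element index at the level break. *)
let split_index ~clt_size i = (i / clt_size, i mod clt_size)

(* For bool[512][32] stored as 512 machine words of 32 bools each,
   flat element 100 lives in word 3, slot 4. *)
let () =
  let word, slot = split_index ~clt_size:32 100 in
  assert (word = 3 && slot = 4)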


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 8, 2018, 10:34:17 PM11/8/18
to felix google


> On 9 Nov 2018, at 06:42, John Skaller2 <ska...@internode.on.net> wrote:
>
>
>
>> On 8 Nov 2018, at 11:12, John Skaller2 <ska...@internode.on.net> wrote:
>>
>> let rec islinear_type bsym_table t =
>> match t with
>> | BTYP_void
>> | BTYP_unitsum _ -> true
>> | BTYP_type_var _ -> true (* THIS IS NEW AND WILL PROBABLY BREAK STUFF *)
>> | BTYP_tuple ts
>> | BTYP_sum ts -> List.fold_left (fun acc t -> acc && islinear_type bsym_table t) true ts
>> | BTYP_rptsum (count,base) -> islinear_type bsym_table base (* coarray *)
>> | BTYP_array (base,index) -> islinear_type bsym_table base
>> | _ -> false
>
> Actually it’s probably correct.


If I use a list recording the projections .. what is the unification rule????


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 9, 2018, 1:42:34 AM11/9/18
to felix google
Well .. I need to figure out how to fix the compiler.

The general form of a read-write pointer is:

BTYP_gptr of t list

This form uses a list of polymorphic types starting with the domain and progressing
down projection chains to the codomain. The domain is the type of the whole object,
the codomain is the final type of the sub-component being accessed.

If the domain and codomain are equal, all the types are equal, and the term
reduces to a machine pointer:

BTYP_pointer of t

In particular, a gptr can never have a single type in the list. A projection
of a gptr adds a new type to the end of the list. A projection of an
ordinary pointer yields a gptr with the ordinary pointer’s type at
the head of the list, and the codomain of the projection at the end.

The head of the gptr is always compact linear. Note, a type variable is
considered compact linear, since it might or might not be on substitution.
After monomorphisation, we can throw out gptrs and replace them with
compact linear pointers:

BTYP_pclt of t * t

However the term exists so to fix the compiler we have to deal with these
as well. So in general, a projection of a pclt produces a gptr with the list
being the two types of the pclt followed by the new codomain of the projection.

So the invariants are:

1. BTYP_pclt (D,C): D must be compact linear. D not equal to C.

2. BTYP_gptr Ts: Ts must be at least 3 elements. The head
must be compact linear.
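
A sketch of a smart constructor enforcing these invariants, with toy
constructors standing in for the real BTYP terms (a real version would
also verify compact linearity of the head):

type ty =
  | TVar of int
  | TInt
  | TPointer of ty          (* ordinary machine pointer *)
  | TPclt of ty * ty        (* compact linear domain, codomain *)
  | TGptr of ty list        (* >= 3 elements, head compact linear *)

(* Collapse a projection path to the smallest legal pointer term. *)
let btyp_gptr ts =
  match ts with
  | [] -> invalid_arg "empty projection path"
  | [t] -> TPointer t
  | [d; c] when d = c -> TPointer c
  | [d; c] -> TPclt (d, c)
  | _ -> TGptr ts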


To fix the compiler, I need to add a constructor for a gptr, the constructor
will actually make a pclt or pointer to preserve invariants.

Unfortunately this is not enough. To ensure the algorithms in the compiler
enforce the invariants, and to ensure I fix every line, I have to remove
the constructors for pointer and pclt.

It’s a bit unfortunate because code after monomorphisation doesn’t ever
need to create a gptr, it always knows to use either a pointer or pclt.
However, if I keep these constructors, there’s no way to find incorrect
code binding projections polymorphically.

I could do this if only BTYP were using polymorphic variants!

Another way to fix this is to JUST have gptr. Pattern matches know
how long lists are. Unfortunately, the gptr form has an issue:
the list can be longer than 2 elements .. in this case we’re ONLY
interested in the first and last elements. At code generation time,
we don’t need the intermediate types. So basically, at monomorphisation
time, we can throw them out.

This suggests a new invariant: all the intermediate elements must be polymorphic.

If I go with gptr only, I can also “throw in” a mode switch: RW, RO, WO,
which I wanted to do anyhow. At present, I have to write out three cases
for each of these because they use distinct constructors.




John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 9, 2018, 7:47:26 PM11/9/18
to felix google
Ok I started doing this, but I have to give up and rethink because the form of the
pointer type has to be followed by the form of pointer expressions.

The problem is, reducing a projection of a pointer. It’s all very well to
reduce the types, but that has to be sync’ed with tracking the projections
on the actual starting address.


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 10, 2018, 8:12:17 AM11/10/18
to felix google


> On 10 Nov 2018, at 11:47, John Skaller2 <ska...@internode.on.net> wrote:
>
> Ok I started doing this, but I have to give up and rethink because the form of the
> pointer type has to be followed by the form of pointer expressions.
>
> The problem is, reducing a projection of a pointer. It’s all very well to
> reduce the types, but that has to be sync’ed with tracking the projections
> on the actual starting address.

More to the point .. I have no idea how to unify.


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 10, 2018, 5:10:37 PM11/10/18
to felix google

Ok so I’m going to refactor the code in stages.

In phase 1, I add the new type term, and make the old term constructors
delegate to it. But I won’t bother enforcing invariants. I delete the old terms.
This means all existing constructions will work.

All pattern matches using the old terms will fail.
There is a mechanical rule for replacing them.

The main problem is implementing a program that does the
replacement. Doing it by hand will be a nightmare.

Six patterns need to be replaced. The problem is identifying
the arguments correctly. Regular expressions are bad for doing
this correctly because I have to find *matching* parens. I also have
to span line boundaries.

Doing it by hand is more reliable but it is going to take a while.
Unfortunately a partial automatic run is likely to make a mess.

But the final result should be identical behaviour to what Felix does now,
except for the diagnostic prints.



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 11, 2018, 2:55:59 AM11/11/18
to felix google


> On 11 Nov 2018, at 09:10, John Skaller2 <ska...@internode.on.net> wrote:
>
>
> Ok so I’m going to refactor the code in stages.

It’s done. Sorting out the bugs, which are basically minor transcription
errors.

At this stage, I parameterised some of the code based on the pointer
mode RW,R,W.

However I haven’t yet generalised the pointer handling. There is now
only one pointer type:

BTYP_ptr of mode * t * t list

The mode is the R, W, or RW code (there’s an N code as well for pointers
that can’t read OR write).

The t is the target type.

The t list is the projection chain. For an ordinary pointer it is currently
an empty list. For a compact linear type pointer, the type of the containing
compact linear value is the sole entry in the list.

All the pattern matches have been hand adjusted to use the new pointer type:

(* machine pointer *)
BTYP_ptr (mode, T, [])

(* compact linear type mach, nested component T *)
BTYP_ptr (mode, T, [mach])

These will be the ONLY cases the back end will ever handle.
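
So the back-end dispatch becomes a two-case match; a sketch with stand-in
types (the string results are just placeholders for real code generation):

type mode = R | W | RW | N
type ty = TInt | TBool | TTuple of ty list

let codegen_ptr (_ : mode) (_ : ty) (path : ty list) =
  match path with
  | [] -> "ordinary machine pointer"
  | [_mach] -> "compact linear pointer (address, divisor, modulus)"
  | _ -> failwith "unreduced projection path reached the back end"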

The front end has to generalised to allow the list to be longer,
in case some of the projection chain is polymorphic.

I still don’t know exactly how to do the unification, but what is there should
be semantically equivalent to what was there before.


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 12, 2018, 6:18:52 AM11/12/18
to felix google
Sorry I broke the build.

Not sure why a failed unit test stops the build (but failed regression tests don’t …)


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 12, 2018, 7:07:38 AM11/12/18
to felix google
So here is the pointer code in Ocaml:

let ismonomorphic t =
  let f_btype t = match t with BTYP_type_var _ -> raise Not_found | _ -> () in
  try iter ~f_btype t; true
  with Not_found -> false

(* Drop leading non-compact-linear types from the projection path. *)
let rec strip_nonclt ts = match ts with
  | [] -> []
  | h :: t ->
    if not (islinear_type () h)
    then strip_nonclt t
    else ts

(* If the head is monomorphic, the rest of the path is redundant. *)
let throw_tail ts = match ts with
  | [] -> []
  | h :: _ ->
    if ismonomorphic h then [h]
    else ts

let btyp_ptr m t ts =
  BTYP_ptr (m,t,ts)

let reduce_ptr m t ts =
  let ts = throw_tail (strip_nonclt ts) in
  btyp_ptr m t ts
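
A worked example of how the reduction behaves, with toy records standing in
for bound types (the clt and mono flags are stand-ins for islinear_type and
ismonomorphic):

type toy = { name : string; clt : bool; mono : bool }

let rec strip ts = match ts with
  | [] -> []
  | h :: t -> if not h.clt then strip t else ts

let tail = function
  | h :: _ when h.mono -> [h]
  | ts -> ts

let () =
  (* path written whole-to-detail: int * (2 * 3), then 2 * 3, then 2 *)
  let path =
    [ { name = "int * (2 * 3)"; clt = false; mono = true };
      { name = "2 * 3"; clt = true; mono = true };
      { name = "2"; clt = true; mono = true } ] in
  match tail (strip path) with
  | [t] -> assert (t.name = "2 * 3")  (* sole entry: the containing clt *)
  | _ -> assert false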


Now, originally the reduce_ptr code was meant to preserve the
invariant by being in the btyp_ptr constructor function. But it failed.

The reason is really NASTY. Felix thinks a named type cannot be compact linear.

The problem is during binding, some really weird stuff happens with typedefs.
At first, typedefs get symbol table entries which are later removed.
The binding process goes through several passes.

The idea of the invariant is that the type of a pointer to a compact linear value consists
of the machine address of a containing compact linear type, and the
type pointed at. The reason is, an actual pointer consists of a pointer to a
machine word containing the whole compact linear type, plus a divisor
and modulus to extract the component. The point is the type of the “top level”
compact linear type matters.

To find the top level compact linear type we just run down the types of the
pointers from the starting product type until we hit one that is compact linear.
At that point, we can’t change the machine address so further projections
change the divisor and modulus instead.


As an example:

var x = true;
var y = true,false;
var px = &x;
var py1 = &y.1;

px is an ordinary pointer to a bool. py1 is a compact linear pointer,
consisting of a pointer to y AND the divisor 2 and modulus 2.

So the two pointers have different types even though they both target a bool.
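
A minimal sketch of that representation (the record and names are mine;
the real thing is a C++ struct in the generated code):

(* A compact linear pointer: a machine address plus two constants. *)
type clptr = { addr : int ref; divisor : int; modulus : int }

(* Extract a component from the packed word. *)
let clderef p = (!(p.addr) / p.divisor) mod p.modulus

(* py1 above becomes { addr = <address of y>; divisor = 2; modulus = 2 }:
   one ordinary pointer and two static constants. *)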

The PROBLEM is my reduction throws out non-compact linear types from
the projection list, but it thinks a typedef of a compact linear type isn’t
compact linear. So I cannot apply the reduction until the binding is fully
completed.

I have attempted to apply it during monomorphisation instead.

I haven’t actually done the full chaining yet.


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 12, 2018, 1:13:05 PM11/12/18
to felix google
Wow! This works. I don’t even know why!


var x = (1,(2,(true, (`1:3,(`3:5,`7:8)))));
println$ x.1._strr;
println$ (x.1).0._strr;
println$ (x.1).1._strr;
println$ ((x.1).1).0._strr;
println$ ((x.1).1).1._strr;
println$ (((x.1).1).1).0._strr;
println$ (((x.1).1).1).1._strr;
println$ ((((x.1).1).1).1).0._strr;

println$ (*(&x.1))._strr;
println$ (*(&x.1).0)._strr;
println$ (*(&x.1).1)._strr;
println$ (*((&x.1).1).0)._strr;
println$ (*((&x.1).1).1)._strr;
println$ (*(((&x.1).1).1).0)._strr;
println$ (*(((&x.1).1).1).1)._strr;
println$ (*((((&x.1).1).1).1).0)._strr;
println$ (*((((&x.1).1).1).1).1)._strr;

Of course it’s not using stand-alone projections. Nonetheless, the second
batch of prints IS using pointer projections that dive down to every sub-component,
THROUGH the non-compact linear to compact linear barrier.

Here’s the output:

(2,(true,(case 1 of 3,(case 3 of 5,case 7 of 8))))
2
(true,(case 1 of 3,(case 3 of 5,case 7 of 8)))
true
(case 1 of 3,(case 3 of 5,case 7 of 8))
case 1 of 3
(case 3 of 5,case 7 of 8)
case 3 of 5
(2,(true,(case 1 of 3,(case 3 of 5,case 7 of 8))))
2
(true,(case 1 of 3,(case 3 of 5,case 7 of 8)))
true
(case 1 of 3,(case 3 of 5,case 7 of 8))
case 1 of 3
(case 3 of 5,case 7 of 8)
case 3 of 5
case 7 of 8

John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 14, 2018, 7:24:16 AM11/14/18
to felix google
So .. I think this is the way forward:

I think I will remove ALL handling of compact linear types except for the code generator,

So a pointer is a pointer, and a projection is a projection, and there are no special
terms for compact linear stuff at all. The detection is done in the back end.

What this means is that applications of projections to pointers are irreducible.

So all compact linear types have the same representation at the moment.
It’s a uint64_t. Pointers to *interior* compact linear types all use the same
C++ struct, but with different arguments to the constructor.

The *tricky* bit is that the type of a pointer no longer tells which representation
to use, because the top level compact linear type is pointed at by an ordinary
pointer.

If an ordinary pointer to a compact linear type is projected, the resulting pointer
is a compact linear pointer. If a compact linear pointer is projected, the result
is ALSO a compact linear pointer.
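
A sketch of that typing rule in OCaml, using the (target, path) part of the
BTYP_ptr form described below and a stand-in for islinear_type:

type ty = TInt | TBool | TTuple of ty list

(* Stand-in for the compiler's islinear_type. *)
let rec is_clt = function
  | TBool -> true
  | TTuple ts -> List.for_all is_clt ts
  | TInt -> false

(* Projection typing: given a pointer's (target, path) and the
   projection's codomain c, compute the resulting pointer type. *)
let project (target, path) c =
  match path with
  | [] when is_clt target -> (c, [target])  (* cross into the clt word *)
  | [] -> (c, [])                           (* still an ordinary pointer *)
  | mach :: _ -> (c, [mach])                (* same word, new divisor/modulus *)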

This means, what I first said will not work! The *type* of a pointer MUST indicate
which kind it is. The existing pointer type I recently implemented should do that.
It has the form

BTYP_ptr (mode, target, projectionlist)

where: mode is initially R, W, RW or N (N isn’t implemented yet but the code is there).
The target is what’s pointed at. The projection list is the forward-order list of domains
of the projections that have been applied to get to this pointer type. The codomain
of each is the domain indicated by the next element in the list, or the target for
the last element.

If, after reduction, the projection list is empty, it’s an ordinary pointer;
otherwise it has one element and it’s a compact linear pointer, with
the one element being the type of the containing compact linear type.

The reduction strips non-compact linear types from the list,
after monomorphisation. These will always be the head elements of
the list. If the list is non-empty, then the tail of the list can be discarded:
it represents the nesting of the compact linear type from the
machine value down to the target. We don’t need these in the type.
In the actual representation the chain of compact linear projections
can be reduced to a single projection.

Non-clt projections are just the usual C field accesses (or array accesses
for arrays). So the result will be

machinepointer = &(p->mem_n1.mem_n2.mem_n3 ….)
cltpointer = clptr_t (machinepointer,divisor,modulus)

The modulus is the size of the target. The divisor is the product of the
divisors of the individual projections. This requires knowing the projection
index and the size of the types stored “to the right” of that component.
The type information for that is in the projection itself. We threw that
information out of the type but not the expression term which still
contains the complete list of projections.
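
A sketch of that computation, assuming each component’s “size” is its number
of values and following the rule just stated, that a component’s divisor is
the product of the sizes of the components to its right:

(* sizes: component sizes of a compact linear tuple, leftmost first. *)
let divisor sizes i =
  List.filteri (fun j _ -> j > i) sizes |> List.fold_left ( * ) 1

let modulus sizes i = List.nth sizes i

let () =
  (* For 2 * 3 * 5, component 1 (the 3) has 5 values to its right:
     a packed word w decodes it as (w / 5) mod 3. *)
  let sizes = [2; 3; 5] in
  assert (divisor sizes 1 = 5 && modulus sizes 1 = 3)

For a chain of projections the divisors multiply, exactly as described above.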

More precisely, a chain of projections looks like

apply (prj3, apply (prj2, apply (prj1, pointer)))

Substitute “aprj” for array projections.



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 15, 2018, 12:31:01 PM11/15/18
to felix google
So the problem is this: a &T has one piece of information and a compact linear
type pointer has two: _pclt<mach,target>.

So, if we have two routines:

fun deref[T] (p:&T) => _deref p;
fun deref[mach,target] (p: _pclt<mach,target>) => _deref p;

overload resolution has to pick one *before* we know the type of a pointer.
After monomorphisation we get the correct type, but its too late: the _deref
would work correctly in each function, but the wrong function may be chosen.


Given a type

T * bool

we don’t know if it’s compact linear until after monomorphisation.

If we use a typeclass, the choice is deferred until after monomorphisation,
but the pointed at value is lost unless we give TWO type parameters to
the class:

class deref[P,V] { virtual fun deref: P -> V; }

which can’t work: the instances would be

instance[T] deref[&T,T] ..
instance[mach,target] deref[_pclt<mach,target>,target] { … }

The problem is the virtual is given a pointer type P, but not V.

Now we should be able to fix this with:

class deref[P] { virtual type V; fun … }

There’s a bug in the routine I need to fix; it barfed with a recursive type
which means I forgot to alpha convert. But even if it worked it would
be useless because the virtual type isn’t fixed until monomorphisation.
We need to know the type of

*p = deref p

during binding to select overloads.

So at the moment it seems that automatic detection of compact linear types
can’t work, because it can only work too late: it works after monomorphisation
but we need the type information during binding, before monomorphisation.

One solution is to explicitly construct compact linear types with
specialised product, sum, and array index operators:

5 * 7 .. ordinary product
5 \* 7 .. compact product

etc. Then

T \* 7

is a compact product and this means T must be constrained to a compact linear type.

Another solution is to use a barrier term

compact< 5 * 7 >


which just means the operators inside are compact ones. It would need to be something more lightweight, like just

< 5 * 7 >

so that say arrays would be

int ^ <5 * 7>

Maybe [] could be used but by symmetry

T[K]

should be a type indexed by a kind. We don’t have these yet and probably won’t.
Note that

vector[T]

is a type, a C++ vector storing Ts. So this notation is no good. If we want to compare
types < and > aren’t so good either. But { } is available.

int ^ {5 * 7}
{ 3 * 2 + 1}

It basically means “pack the value of this type into a uint64_t”.

The point is now, the type tells us exactly where the boundary is.
In particular

2 ^ 3

is not compact linear but

{ 2 ^ 3 }

is. And I guess

{ T }

means a compact linear type variable:

T:COMPACTLINEAR


The PROBLEM is that we get something like:

T = { U }

to solve in unification, what do we do?






John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 16, 2018, 10:29:11 AM11/16/18
to felix google

Well! This works as written:

/////////
instance Str[3] { fun str (x:3) => x._strr; }
instance Str[5] { fun str (x:5) => x._strr; }
instance Str[8] { fun str (x:8) => x._strr; }

proc test[T,C:COMPACTLINEAR] (i:T,j:C) {
var x = (1,(i,(true, (j,(`3:5,`7:8)))));
println$ x;
println$ x.1._strr;
println$ (x.1).0._strr;
println$ (x.1).1._strr;
println$ ((x.1).1).0._strr;
println$ ((x.1).1).1._strr;
println$ (((x.1).1).1).0._strr;
println$ (((x.1).1).1).1._strr;
println$ ((((x.1).1).1).1).0._strr;

println$ (*(&x.1))._strr;
println$ (*(&x.1).0)._strr;
println$ (*(&x.1).1)._strr;
/*
println$ (*((&x.1).1).0)._strr;
*/

println$ (*((&x.1).1).1)._strr;
println$ (*(((&x.1).1).1).0)._strr;
println$ (*(((&x.1).1).1).1)._strr;
println$ (*((((&x.1).1).1).1).0)._strr;
println$ (*((((&x.1).1).1).1).1)._strr;
}
test (2,`1:3);
/////////

First .. surprisingly, the instances work! However this:

instance[C:COMPACTLINEAR] Str[C] { fun str(x:C) => x._strr; }

does not. It fails with

Function 13100[bool,3 * (5 * 8)]
instance parent 13096[<T13097> * <T13098>]
instance vs= T<13097>:TYPE,U<13098>:TYPE
defined: /Users/skaller/felix/src/packages/core_type_constructors.fdoc: line 615, cols 4 to 55
614: instance[T,U] Str[T*U] {
615: fun str (t:T, u:U) => "("+str t + ", " + str u+")";
****************************************************
616: }

CLIENT ERROR
[flx_frontend/flx_typeclass.ml:741: E366] No most specialised instance!

Which seems correct. Interesting, _strr delegates to str if necessary,
but here, str is delegating back to _strr. :-)

Anyhow the KEY thing to see in the test case is that this fails:

println$ (*((&x.1).1).0)._strr;

[Flx_lookup.bind_expression] Inner bind expression failed binding (_strr (deref (0 (1 (1 &(x))))))

Flx_lookup:inner_bind_expression: unknown exception File “src/compiler/flx_core/flx_btype.ml”,
line 751, characters 33-39: Assertion failed

And the reason is simple enough: it’s trying to find the size of a type variable.
The interesting thing is that the other cases don’t fail BECAUSE the size isn’t required.

But that is BY LUCK! Because it happens the type variable comes first in the tuple.
If the order were reversed, the size would be needed for all the other projections
and pointer projections. The size isn’t needed for printing the actual type variable
because .. it prints like so:

(typevar?:2,(true,(typevar?:case 1 of 3,(case 3 of 5,case 7 of 8))))

So by LUCK all the value cases work.


Now here’s the thing: with an explicit chain of projections we CAN do the calculations
after monomorphisation, although _strr will NOT work. This is because its a HACK.
The bound type is analysed and _strr generates UNBOUND code which is then bound.
It shouuld generate BOUND code.

If the chain isn’t explicit, for example the projections are stand-alone and stored in variables
first, then applied, so we know the type of the projection but not the actual projection,
it should STILL WORK because we can do the calculations at run time. Although there’s
no way to generate compact linear values and projections at run time at the moment,
the C++ types can do the calculations. After all, it’s just division and modulus, which works
with arbitrary expressions not just constants. Of course .. the compiler generated stuff
is type checked.

The key problem is to pick the right deref function which picks _deref. The problem would
go away if the parser generated _deref. But at the moment, the parser generates
deref, which is overloaded, and both overloads call _deref, but with non-compact and
compact types, respectively.

Overload resolution picks the wrong function. I like the overload because I want the
syntax to work for abstract pointers too.

IN FACT .. it might be interesting to make the abstract pointer type used in all cases
and specialise it after monomorphisation (though, I don’t know how to do that).

There HAS to be a way to make this work by just delaying evaluations that can’t
be done prior to monomorphisation.



John Skaller
ska...@internode.on.net




