flx_fbuild_prep


John Skaller2

Sep 9, 2020, 11:27:29 PM9/9/20
to felix-l...@googlegroups.com
The makefile now contains a target:

macosxfpcs:
build/release/host/bin/flx_build_prep --debug --target-dir=build/release --target-bin=host --configure --compiler=clang --os=macosx --bits=64

which updates the *.fpc files in build/release/host/config from the extracted standard ones
AND from

~/.felix/config/host/config/*.fpc

This can be used for persistent overrides of system packages OR for
extras. The point is a clean build of Felix from scratch will not clobber
files here. This replaces having to copy the files manually.

The fbuild bootstrap has not been adjusted yet.


John Skaller
ska...@internode.on.net





John Skaller2

Sep 13, 2020, 9:45:37 PM9/13/20
to felix google
So Felix can now parse some Objective C.

In my test case objc-grammar.flx this parses:

////////
open syntax ObjC;
@interface Sub : Super
{
@public NSString s;
}
@property (weak,readonly) A B;
@property (fred = joe) A B;
@property (maxine = jean:) D E;

+ (TYPE) cmeth;
- (TYPE) imeth;
- (TYPE) xWith: (TYPE)x yWith: (TYPE)y;
@end

println$ "Hello World”;
////////

Note that you can’t put @ in column 1 because flx thinks it’s an fdoc processing command.
The only type recognised at the moment is the word TYPE :-)

There are two options for types and variables I’m considering:
(1) ObjC syntax
(2) Felix syntax

There’s also a question what to do with the code. I am thinking of two statements
for interfaces:

objc-generate @interface …. @end

generates a copy of the code in the header file as well as generating bindings whereas

objc-bind @interface .. @end

assumes the client has #included some ObjC headers including the one for the specified
class, and we just want Felix bindings of it.

Emitted headers would be more or less the same as input EXCEPT that Felix types
get translated into ObjC ones. The machinery to do this already exists of course.
ObjC types get left as written.

For variables we can allow

var name: felixtype;

and also

objctype name;

provided the parser can tell the difference. Not sure of this, something definite is probably
better.

The bindings will generate a constructor with the class name as usual, it will have to
find an “init” method or something. Methods will generate Felix functions like

fun WithA_WithB (obj:ClassName, a:Atype, b:Btype) : RetType =>
cexpr[RetType] "[$1 WithA: $2 WithB: $3]" (obj,a,b) endcexpr
;

in other words, the method name WithA:WithB: is mangled by replacing : in the ObjC
method name with an underscore. An alternative is the quote mark, so it reads

fun WithA'WithB' (….


The idea here is that the ObjC syntax is a DSSL, Domain Specific Sub Language.
The usual rule is that it extends Felix, so types and expressions in this stuff
are FELIX types and expressions.




John Skaller
ska...@internode.on.net





John Skaller2

Sep 17, 2020, 9:38:26 AM9/17/20
to felix google, gordon Childs
OK so version 1 of the Objc DSSL will be like this:


open syntax ObjC;
@objc-bind
@interface Sub : Super

+ (NSString) cmeth;
- (int) imeth;
- (void) xWith: (TYPE)x yWith: (TYPE)y;
@end

println$ "Hello World”;

Note this is NOT Objective C, it's FELIX and it specifies a BINDING to an existing
ObjC interface which must be included in the generated code somehow, possibly by
adding a requires clause (yet to be sorted!)

What will happen is this:

1. Super must already be defined as a Felix type mapped to an ObjC class type.

2. Felix will define

type Sub = "Sub*";

Note the defined type is a pointer.

3. Felix will define a subtyping coercion

supertype Super: Sub = "$1";

which means values of type Sub can be passed as arguments anywhere Super is specified.

4. An instance or class method will be defined with the specified FELIX return type.

5. The method name will be the concatenation of the ObjC name thingos with ":"
replaced by "'". For example xWith'yWith' will be the method name of the last case
in the example.

6. For a class method I don’t know what to do. A call like

[Sub cmeth]

is how it is written in ObjC. It’s easy to make a function that does that,
but the method name will then have to be globally unique. Underneath
ObjC is passing a pointer to the class object so perhaps we need a
Felix type for that, i.e something like

type ClassType = "";
const ClassSub : ClassType = "Sub";

Then “ClassSub” can be passed as an argument.

7. For an instance method, the type will be a function type

OBJ -> ARGS -> RET

where OBJ is Super in the example, ARGS is a tuple of the remaining parameter types,
and RET is the (mandatory) specified return type EXCEPT if RET is void, in which case
the type will be a procedure type and RET will thus be 0.

The critical thing with this type is that you can form object closures. It is possible to consider
currying ALL the arguments instead of taking a tuple of arguments.

Method names overload as usual only on the object type. Within an object, ObjC method
names must be unique, which means closures over the object will always bind uniquely.


There is other stuff in the Obj grammar, including instance variables, and properties.
These will be handled later.

It’s important to note again with this DSSL that the types specified are FELIX types
NOT OBJECTIVE C TYPES. In particular this would NEVER work in Objective C:

+ (NSString) cmeth;

because you cannot return an NSString by value, you have to return a pointer.
But the library already says:

type NSString = "NSString*" requires package "Foundation";

The actual binding of an instance method will be like this:

fun xWith'yWith' (obj: Sub) (a1: TYPE, a2: TYPE) =>
cexpr[NSString] "[$1 xWith: $2 yWith: $3]" (obj, a1, a2) endcexpr
;


EXCEPT if it’s a procedure when the binding will be adjusted accordingly.


John Skaller
ska...@internode.on.net





John Skaller2

Sep 18, 2020, 4:29:41 AM9/18/20
to Gordon Childs, felix google


> On 18 Sep 2020, at 17:27, Gordon Childs <gordon...@gmail.com> wrote:
>
> This is great - so can I now pick an objective-c class (say AVAudioFile) and, after some wrapping with the DSL, use it in felix?

Yes, that's the idea. You can already do this now, it's just more laborious to do the wrapping.

Some further rules:

* A return type of instancetype is not allowed at present. Write the class name instead.

* Class methods do not have an object argument. Instead, the class name is prepended
to the method name. For example in the test case:

/////////
open syntax ObjC;
@objc-bind
@interface Sub : Super

+ (NSString) cmeth;
- (int) imeth;
- (void) xWith: (int) yWith: (NSthing);
- (int) xWith: (int) yWith: (NSthing) zThing:(double);
@end

println$ "Hello World”;
////////

we have

fun Sub'cmeth : unit -> NSString

generated. Note this means you cannot form closures for class methods.
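For example, a sketch of calling it (untested; the # prefix just applies a nullary function to the unit value, as used later in this thread):

var s1 : NSString = Sub'cmeth ();
var s2 : NSString = #Sub'cmeth;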

* Instance methods always accept "exactly two arguments".
* The first argument is the object
* the second argument is the rest of the arguments as a tuple or () if there are none

For example

fun imeth: Sub -> unit -> int // NOTE unit argument!!
proc xWith'yWith' : Sub -> int * NSthing // Note IMPLICITLY returns void
fun xWith'yWith'zThing: Sub -> int * NSthing * double -> int

This means you can always form a closure of an instance method over the object,
and more pointedly you MUST do so. In particular

var j:int = imeth obj ();

Note carefully that to call this one you have to supply the unit value as well as the object.
Also note the reason for this:

var j: int = obj.imeth (); // Recall operator . just means reverse application

This works because application is left associative and so

var j:int = (imeth obj) (); // shows associativity
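So a sketch of forming the closure explicitly and calling it later (untested, it is just partial application):

var get_i = imeth obj;      // closure of type unit -> int over the object
var j2 : int = get_i ();    // supply the unit argument later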

I have no idea if closures will work correctly with ARC. In general closures form
cycles which defeat reference counting. However this is unlikely here because
they’re effectively closures over bindings which are unlikely to be passed as
arguments to other ObjC bindings.

Note this means method names CAN BE OVERLOADED on the object type.

* If the return type of a method is "void" Felix uses a procedure, otherwise a generator.
It HAS TO BE SPELLED "void".

* There's no support yet for ivars or properties. These aren't hard but I'm confused about
things like @private and readonly properties. I also suspect in most Apple interfaces there
are just methods exposed.

* Protocols aren’t supported yet. These are just class templates. So it shouldn’t
be hard to add them.

DEBUGGING WILL BE NASTY. As it always is when the compiler generates stuff
and you get an error, you end up with references to generated code or, quite often,
reference “up in the air” because the generated code has no source location.

CAVEAT. If you have an ObjC method like

- (NSString*) description;

the Felix DSSL has to read:

- (NSString) description;

This is because

type NSString = “NSString*”;

i.e. in Felix the “pointer” is already there.



John Skaller
ska...@internode.on.net





John Skaller2

Sep 18, 2020, 8:25:40 AM9/18/20
to felix google, Gordon Childs
And .. THIS WORKS:

////////////////////////
include "std/apple/Foundation";

open syntax ObjC;

header small_class_interface = c"""
@interface SmallClass: NSObject { }
- (int)get1977;
@end
""";

body small_class_implementation = c"""
@implementation SmallClass
- (instancetype)init {
self = [super init];
return self;
}
- (int)get1977 {
return 1977;
}
- (int)getsum: (int)toadd {
return 1977 + toadd;
}

@end
""";

requires small_class_interface, small_class_implementation, package "foundation", package "ObjC";

@objc-bind @interface SmallClass
-(int)get1977;
-(int)getsum:(int);
@end

var sc : SmallClass = cexpr[SmallClass] "[[SmallClass alloc] init]" endcexpr;
println$ "Get: " + (sc.get1977()) . str ;
println$ "Add: " + (sc.getsum' 42) . str ;
///////////////////



John Skaller
ska...@internode.on.net





John Skaller2

Sep 18, 2020, 9:06:27 AM9/18/20
to felix google, Gordon Childs
Class methods (at least with no arguments) now work too:


@objc-bind @interface SmallClass
+(SmallClass)alloc;
-(SmallClass)init;
-(int)get1977;
-(int)getsum:(int);
@end

var sc : SmallClass = cexpr[SmallClass] "[[SmallClass alloc] init]" endcexpr;
println$ "Get: " + (sc.get1977()) . str ;
println$ "Add: " + (sc.getsum' 42) . str ;


var sc2 : SmallClass = #(#SmallClass'alloc.init);
println$ "Get: " + (sc.get1977()) . str ;
println$ "Add: " + (sc.getsum' 42) . str ;


John Skaller
ska...@internode.on.net





John Skaller2

Sep 19, 2020, 12:35:31 AM9/19/20
to felix google, Gordon Childs
I am changing the API for argumentless methods for compatibility with
(the very few) existing bindings in the Felix apple library.

So for example

-(SmallClass)init;

will be called by

object.init

instead of

object.init ()

The latter is more consistent, but is extra verbose considering almost all methods will
be called rather than object closures formed. You can easily make a closure like this

{ object.init }

anyhow. The triggering issue was that I just implemented this:

fun description : NSObject -> NSString = "[$1 description]";

which follows the latter convention, not the former. It is currently

IMPOSSIBLE

to give curried arguments in bindings. In fact the alternative using
inline bindings never occurred to me before this project:

fun f(x:int) => cexpr[int] "$1" x endcexpr;

is equivalent to

fun f: int -> int = "$1";

but more verbose BUT it allows full Felix typing including currying because,
quite simply, the first f function is actually a full Felix function, not a binding.
It effects the binding via an inline binding, cexpr .. endcexpr, instead.
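For example, a sketch of a curried "binding" done this way (add and add1 are made-up names, untested):

fun add (x:int) (y:int) : int =>
cexpr[int] "$1 + $2" (x,y) endcexpr
;

var add1 = add 1;    // partial application works because add is a real Felix function
println$ add1 2;     // 3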

THERE IS ALWAYS A CLASH BETWEEN ACTUAL SIMPLICITY of the
underlying system and WHAT THE USER PERCEIVES AS SIMPLE based
on minimal syntax of common cases.

Another example of this clash is the existence of "constant type constructors"
copied from Ocaml's design, when in the abstract all type constructors should
be functions, and constant constructors should be constant functions (accepting
a unit argument). But do you REALLY want to write #True everywhere instead of
just True??


John Skaller
ska...@internode.on.net





John Skaller2

Sep 19, 2020, 7:19:24 AM9/19/20
to felix google, Gordon Childs
So what to do with properties?
=========================

In the binding code,

@property T v;

is exactly the same as

-(T)v;
-(void)setv:(T);


and

@property (readonly) T v;

is just the first, so it’s a totally useless wank. The syntax is also bad because it’s C like.
So better would be just

@var v:T; // read write
@val v:T; // read only

none of the other property attributes make any sense in an interface as far as I can see.
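So in terms of the generated Felix bindings, a read/write property v of type T on some class C would expand to roughly this sketch (the exact selector spelling the generator emits may differ; a readonly property just gets the first line):

fun v : C -> T = "[$1 v]";
proc setv' : C * T = "[$1 setV: $2];";   // ObjC capitalises the setter selector to setV: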

Protocols
========

As far as I can see a protocol is just an abstract type. So declaring one is exactly
the same as declaring a class, in other words

@protocol = @interface

do exactly the same thing. Now suppose we have a method fred in a protocol hasFred and

@interface X<hasFred>

then we expect

x.fred

to work, as if the interface of X specified a fred method. Now by my theory
the @protocol generates:

type hasFred = "id";
fun fred: hasFred -> thing = "[$1 fred]";

exactly as if it were a class interface, and,

supertype hasFred : X = "$1";

is generated for class X. This means there is an implicit coercion
from objects of type X to those of the hasFred, and therefore

var x = cexpr[X] "[[X alloc] init]" endcexpr;
.. x.fred ..

will work and return a thing. That’s because it really reads

fred x

and fred requires a hasFred, but X is a subtype of hasFred so it will work.

Protocols ALSO may be SPECIFIED to conform to other protocols.
This does NOT mean they have the extra method injected in, no,
it's exactly as above: it's nothing but a subtyping coercion.

In summary the Felix algebraic type system is so vastly superior it can
already handle protocols exactly the same way as supertypes.
From an interface view, this is just multiple inheritance.




John Skaller
ska...@internode.on.net





John Skaller2

Sep 19, 2020, 9:19:06 AM9/19/20
to felix google, Gordon Childs
ok so protocols are now also available:


@objc-bind @protocol hasDescription
-(NSString)description;
@end

Exactly the same as a class interface except no super class, no ivars,
just methods. Class methods are implemented as well but because of how
I did them, you end up with an attempt to call the method on the protocol
name. This suggests using AnyClass as the type of a class, so that
class objects exist, and then having to pass it. In this case a protocol
class method is called on a dynamically passed value of the type AnyClass,
which is an actual pointer to a class object, and dynamic dispatch should
make it work. Not sure on this so at the moment class methods in protocols
are just an accident :-)

Once a protocol binding is defined you can use it in a class or protocol
interface:


@objc-bind @interface SmallClass<hasDescription>
+(SmallClass)alloc;
-(SmallClass)init;
-(int)get1977;
-(int)getsum:(int);
@end

To prove it works:

var sc : SmallClass = #SmallClass'alloc.init;
println$ "Get: " + (sc2.get1977) . str ;
println$ "Add: " + (sc2.getsum' 42) . str ;

// SmallClass is a subclass of NSObject by default
fun require_subtype (x:NSObject) => (x.description).str;
println$ sc.require_subtype;

// SmallClass obeys protocol hasDescription
fun require_description(x:hasDescription) => (x.description).str;
println$ sc.require_description;

The protocol is an incomplete type in C++, namely this:

struct hasDescription *

I have no idea what ObjC does, but we don’t care at the moment.

As you can see, pointer subtyping is used both for class super types,
class protocols, and protocol protocols too.

Optional methods in protocols are not supported because they don’t make
any sense.

NOTE CAREFULLY: The bindings are TYPE SAFE.
PROVIDED … the bindings accurately reflect what they bind to.
It's the same as for C bindings. If you bind Felix type int to C type float
you can’t expect things to actually work :-)


John Skaller
ska...@internode.on.net





John Skaller2

Sep 19, 2020, 9:12:32 PM9/19/20
to felix google, Gordon Childs
So a small syntax change allowing this:

objc-bind
@protocol hasDescription
-(NSString)description;
@end
requires small_class_interface, small_class_implementation, package "foundation", package "objc"
;

The @ is gone from the top level keyword “objc-bind” and a “;” is required at the end of the
statement. Same for class interfaces. You can now add requirements, they’re simply
added to the type involved, I didn’t bother with methods. I probably SHOULD add them
to class methods (because there’s not necessarily a use of the class instance type!)

===================

Now there’s an issue with protocol class methods. Consider:

objc-bind
@protocol hasDescription
-(NSString)description;
+(NSString)classname;
@end
requires small_class_interface, small_class_implementation, package "foundation", package "objc"
;

This will generate a constant method:

fun hasDescription'classname () = "[hasDescription classname]";

which won’t actually work because AFAIK hasDescription doesn’t exist.
In Felix the binding is:

type hasDescription = "struct hasDescription*";

which is a dummy incomplete type. The Felix type has to exist to allow
coercions, in the specification:

objc-bind
@interface SmallClass<hasDescription>
+(SmallClass)alloc;
-(SmallClass)init;
-(int)get1977;
-(int)getsum:(int);
@end
requires small_class_interface, small_class_implementation, package "foundation", package "objc"
;

Felix will generate this:

supertype hasDescription: SmallClass = "$1";

So that this works:

fun require_description(x:hasDescription) => (x.description).str;
println$ sc.require_description;

The actual function will get a pointer of type SmallClass* and call the description
method on it like so:

cexpr[NSString] "[$1 description]" x endcexpr

The point is the supertype coercion does NOT erase or change the type,
the type WILL NOT be struct hasDescription*, in fact I think I could have left
the type binding blank like so:

type hasDescription = "";

In other words it's a phantom which enables type checking but doesn't impact
the generated code.

So the issue is: what the heck do we do with class methods, in protocols
particularly, but the issue also impacts classes. The alternative for classes again
is that ObjC has an actual type

AnyClass*

which is the type of all pointers to class objects (NOT instances!).

This means you can apply ANY method to any class, and it may or may
not work at run time but there’s no way to type check it. By contrast the
existing mechanism:

SmallClass'alloc ()

is type safe in the sense the argument () is type checked and the return type is known
to be SmallClass (in Felix, in C SmallClass*).

NOTE ALSO: can you call a class method on an instance object? I don’t know.
If you can that won’t work at the moment.


John Skaller
ska...@internode.on.net





John Skaller2

Sep 19, 2020, 11:05:57 PM9/19/20
to felix google, Gordon Childs
I’m now looking into clang blocks. More generally closure technology.
A good article although old:

https://www.mikeash.com/pyblog/friday-qa-2011-06-03-objective-c-blocks-vs-c0x-lambdas-fight.html


1. FELIX CALLBACKS

To do callbacks, Felix has a callback construction. What this does is create
a special type and a thunk. I will report details later because I've forgotten exactly
how it works. The construction gives a C function signature with one modification:
the function type name is repeated in the argument list in one place. This signifies
that the corresponding argument is a void * which is assumed to point at
some object that holds context for the C function, the client data pointer.

Felix generates a thunk which assumes that the client data is a Felix function value,
which underneath is a pointer. It then casts the void * to the appropriate Felix function
type and invokes the apply() method with all the other arguments passed to the
callback. Therefore, you have a full scale normal Felix function operating as a callback.
If it’s a procedure the additional constraints of a function apply: you cannot do SVC requests
and thus cannot do channel I/O. The garbage collector, however, should work, because
it saves the machine stack using a longjmp and restores it after a collection.

Of course for some callbacks this can’t be allowed, the GC does allocations, takes
ages to run, acquires and releases system locks, etc etc. GC can be inhibited for
some types on allocations but this may not be enough. The programmer must ensure
the function will meet the requirements, the compiler will not help (at the moment).
I note in particular calling functions in the general case causes a heap allocation!
It's often optimised away but no promises! The only workaround is to use the cfun
binder, which guarantees to generate a C function with the specified signature
which means any attempt to access the thread frame (and thus GC) will lead to
a C++ compile time error.

To use a Felix callback you call the registration function passing the thunk as the
C function, and the Felix function as the client data.

Therefore, to use this feature BOTH the callback type and registration function
provided by the client API must include an identifiable client data pointer argument.

It’s messy to use this machinery, and last I looked there was a problem that one
of the types was wrong: in one case it’s one thing and in another case another thing,
and there’s no way to cast it to meet both requirements. So the result has a type safety
loophole. I'll investigate further.

PROS: works with all common APIs that model callbacks using client data pointers
and C functions, which is almost all of them. In particular, it's platform independent,
so it works with all compilers on all OSes.

CONS: it's complicated to use and doesn't enforce constraints imposed by Felix
itself, nor those imposed by the client API.

2. CLANG BLOCKS

Now we get some messy issues. These things are well typed and work in C, C++,
and ObjC and ObjC++ in general, but ONLY with clang. They’re safe in the sense
that they copy captured values at block creation time so continue to work even if
the containing scope ends its life. The block object itself starts on the stack and gets
copied to the heap on demand, which means pointers to it are unreliable.

A pointer to a possible heap allocated value is used if the variable being referenced
in the block is designated __block. I think these start on the stack too but also get
copied to the heap.

Block objects are not reentrant. If you copy one, you’re copying a pointer only.
This means local variables of the block are shared. So you have to use
a special Block_copy function to copy it, and this is true if you want to pass the
block outside the lifetime of the stack. Messy! The user has to THINK.

PROS: shared __block variables are managed automatically
The syntax gives blocks a definite type

CONS: platform/compiler dependent
Blocks crash if the containing scope ends, unless explicitly copied onto the heap

3. C++ LAMBDAS

These things provide greatest flexibility about capturing rules. They’re first class
values, so cannot become invalid, ever. However variables captured by reference,
or as usual anything pointed at, can be invalidated when the lifetime of the
referred to thingo ends. The BIG DOWNSIDE and it is a HUGE DOWNSIDE
is that they have no definite type. It's close to saying this stuff is utter crap.
It works when the API requires a generic function-like object but is useless
when a specific callback type is required. Conversion machinery does exist
but is fraught with problems.

PROS: platform independent
CONS: no definite type


SUMMARY
=========

Lots of people on Apple use Clang blocks because APIs like Grand Central Dispatch
are designed for them. But supporting a compiler specific extension in Felix when that
extension is a general function type requires extensive changes: basically I would have
to add a new function type to Felix, which impacts the whole compiler.

A weaker approach may be to use opaque but polymorphic types somehow, and put the
machinery in a DSSL.

Perhaps the BEST method is to use Felix callback machinery AND some kind of DSSL.

However if we have to handle client supplied blocks, or face an API that does not provide
traditional C callback handling, or deprecates it, this could be problematic.

All in all, research must continue. The big problem is that Felix is designed assuming
heap allocations are OK and this just isn’t the case inside most callbacks.
One solution is to provide an alternate allocator but to use two allocators may have a performance
impact .. we need to keep at least a separate Judy array or some extra data to determine which
deallocator to use. The performance issue can be avoided for system objects by marking the type
as exclusively using the fast internal allocator. Also some other method could be used to
distinguish the allocator used, eg by just examining the address. That would support a fixed
pre-allocated block to speed up some early allocations.



John Skaller
ska...@internode.on.net





John Skaller2

Sep 20, 2020, 12:39:38 AM9/20/20
to felix google, Gordon Childs
So to handle ivars (instance variables) I am thinking to use the same machinery as cstruct.
But this leads to an issue. Recall

cstruct fred { int x; y:int; };

defines a binding to an existing C struct (leave out the "c" and it generates the struct too).
It's a bit weird that you cannot put "var x: int" but that's what we have.

So for an Objective C class interface, the bit in {} is exactly a C structure anyhow.
You can put anything in there you can in C or C++.

Consider:

objc-bind
@interface SmallClass<hasDescription>
{
int x;
y: int;
}
+(SmallClass)alloc;
-(SmallClass)init;
-(int)get1977;
-(int)getsum:(int);
@end
requires small_class_interface, small_class_implementation, package "foundation", package "objc"
;


in a cstruct you can write:

var a : struct-name;
… a.x …
… &a.x <- 42

because the name of a struct member is also the name of both a value and pointer projection.

The PROBLEM is that I am currently doing this:

type SmallClass = "SmallClass*";

I will have to replace that with something like:

cstruct _SmallClass { .. }
typedef SmallClass = &_SmallClass;

But now, in ObjC you have to write

sc -> x

to refer to the value x, where sc is type SmallClass*. This is also an lvalue so accepts
an assignment. In Felix on the other hand

sc.x

will be a pointer to int when sc is of type SmallClass so you can write

sc.x <- 42

but for a value you have to write

sc*.x
or
*(sc.x)

But the problem is worse, because the cstruct binding will fail because the underlying
C type is SmallClass NOT _SmallClass.

I have run into this problem before in a different guise that shows a weakness in the
Felix binding system. With Felix structs the rule is that

fieldname value == value.fieldname

is a value projection and

fieldname pointer = pointer.fieldname

is a pointer projection. Fine. But what if we have an abstract type which is a pointer
underneath??? The canonical exemplar is varray! (Is canonical exemplar an oxymoron
or am I just a moron? A nasty aspersion to the good people of the Cuban island of Moron.)

The problem with varray is that it is nonsense to write

&v. 1 <- v . 2

because v is already a pointer, a reference to a C array of known length.

In previous cases I handled this problem for lifted abstract types exactly by

private incomplete type _fred = "fred";
type fred = new &_fred;

or just

type fred = "fred*";

but then you have to hand define the projections. A change to the compiler and parser
to allow

cstruct ftype = "ctype*" { int x; }

may help. The idea is it defines Felix projections

fun x : ftype -> int = "$1 -> x";
fun x : &ftype -> &int = "&($1 -> x)";

and says

type ftype = "ctype*";

Now we can have

cstruct SmallClass = "SmallClass*" { int x; }

and given

var sc : SmallClass …

the expression

sc.x

invokes the value projection passing, at the C level, a pointer
to SmallClass. The problem is then reduced to implementing the
two projections.

At present, these functions are NOT defined in the symbol table,
although they used to be. Instead, the code is synthesised on demand
by the lookup routine. In other words, when you try to apply a function f to
a value of type T, if f cannot be found AND T is a struct or cstruct type,
then we look for a field in T named f.

But the binding is defined in the virtual machine, and only rendered in
the back end. Hmmm...



John Skaller
ska...@internode.on.net





John Skaller2

Sep 20, 2020, 3:21:53 AM9/20/20
to felix google, Gordon Childs
I am starting to get a handle on this.

First observe:

struct A { int x; }

var a = A(1);
var pa = &a;

println$ a.x;
println$ *(pa.x);
println$ pa*.x;
println$ pa->x;

In the grammar the following are enforced:

1. operator . means reverse application
2. pa *. x and pa-> x both mean the same as *(pa . x)
3. when x is applied to a value it is a value projection
4. when x is applied to a pointer it is a pointer projection

Thus a component assignment is done by:

pa . x <- 99;

where the LHS is a pointer to an int as required because

x: &A -> &int

is the pointer projection used. The core idea eliminates lvalues but we have
instead this:

&a

"take the address of a variable". Actually the variable IS an address and &a is
just that address. Contrarily “a” by itself is actually a fetch from that address.
So in principle & is NOT AN OPERATOR, rather &a is just the variable address.
Pointers can also be obtained from bindings and heap allocations with operator new.

Note here pa could be dereferenced returning the whole struct A:

var b = *pa; // a copy of a

NOW having covered the preliminaries we have to see what the problem is
with opaque pointers. These are values that seem like ordinary values to Felix,
but underneath they’re pointers. So when you “pass them by value” syntactically you are
actually “passing by reference” semantically.

The exemplar is varray. But in Objective C it is EVERY TYPE.

The critical feature of these types is typically that the opaque pointer CANNOT
BE DEREFERENCED. It refers to an uncopyable shared value.

So with SmallClass I added some ivars and hand implemented the standard protocol:

@implementation SmallClass
- (instancetype)init {
self = [super init];
self->x = 42;
self->y = 98;
return self;
}
….


objc-bind
@interface SmallClass<hasDescription>
{
int x;
y: int;
}


println$ sc.x, sc.y;
&sc . x <- 88;
println$ sc.x, sc.y;

Result:

(42, 98)
(88, 98)

So this all works fine but it is STUPID. Varray has the same problem.
Why take the address of variable sc, when it is not in fact being modified?????
It is the structure it points at that is being modified.

In fact what we want is to emulate pointers so that:

sc.x

is a POINTER to an int, and,

sc->x

fetches the value of the int x, and

*sc

IS ILLEGAL. But it is more complicated. Suppose x were an NSString * in C, and thus
an NSString in Felix. Then in fact

sc.x

in Felix would accept an opaque pointer of type SmallClass and return another opaque
pointer of type NSString. The problem is we want the machine address of the slot in
object pointed to by sc if we want to assign to it. So actually

sc.x // should return a MACHINE ADDRESS
sc->x // should return an opaque pointer (in the NSString case)
sc->x // should return an int in the int case

and *sc shouldn’t be allowed. So if I implement the projection “x” to return
a machine address then

*(sc.x)

will work IN BOTH CASES. In particular

sc*.x
sc -> x

will both BUG OUT because they mean

(*sc).x

and sc cannot be dereferenced. The problem is that none of these
operators can be overloaded except dereferencing. In fact,

an immediate dereference followed by a component selection

is just fine, as long as the dereferenced value is used up immediately.
In other words, it's OK to return a proxy provided it has to be immediately
used up.

Maybe operator -> should be overloadable. This would avoid the problem.






John Skaller
ska...@internode.on.net





John Skaller2

unread,
Sep 20, 2020, 9:11:18 AM9/20/20
to felix google, Gordon Childs
So half formed plan now. Review: a real pointer has the following behaviours:

*p //deref
p . field = field p // field is a pointer projection
p*. field = (*p) . field // value projection via deref
p->field // defined same as above

The meaning of . *. and -> is enforced by the parser, however *p binds
to an overloadable function deref, which by default binds to the compiler
deref intrinsic _deref. Note, ordinary overload not type class based.

With this model a value of abstract type such as a varray or this in ObjC:

type SomeStruct = "SomeStruct*";

which is supposed to work like a pointer, can overload the projection function
to return a real pointer or another abstract one, but in ObjC deref is not possible
and so the other operators will fail because the interpretation is enforced and depends
on deref.

The solution is to allow -> to be overloaded. Let's look at this one:

@interface SmallClass { NSString *st; } .. @end

This is ObjC code not Felix binding code. Now we define

type SmallClass = "SmallClass*";
fun st : SmallClass -> &NSString = "&($1->st)";

so now if sc is a SmallClass in Felix

sc . st

will be a MACHINE POINTER to the slot named st. This means

sc . st <- @"Fred";

will correctly assign a new string to the st member of the object pointed at
by st. Of course now to get the value:

*(sc.st)

will return an NSString in Felix or a NSString* in C. But this syntax is not what we want!
What we want is:

sc -> st

to return the string, that is, to act like a value projection. The only way to make this
work is to overload operator -> because the current parser definition leads to

(*sc) . st

and *sc is undefined. But this is not so easy to do because whilst sc is in general
an expression, the st there is just a name. It’s that name which can be overloaded
by making it a function, but that is already defined.

In C++ operator->() is an overloaded unary postfix operator but SPECIAL rules apply
to using it: after the function is called on a value, it is called again recursively until
a machine pointer type is found, then the fixed semantics of that are applied.
Felix doesn't currently do that.

There IS a way to make it work: we define deref to return

type proxy[SmallClass] = "SmallClass*"; // roughly!

and then define

fun st: proxy[SmallClass] -> NSString = "$1->st";

so now

sc -> st

is defined as

(deref sc) . st

and deref sc returns a proxy type which st overload accepts as an argument
and returns the required component value.

The PROBLEM with this is that proxy[T] isn’t a real type. So this would type check
in Felix:

var x = *sc;

but would fail at C++ compile time because it's a cheat. Well we could do this:

type _proxy_SmallClass = "void";
fun deref : SmallClass -> _proxy_SmallClass = "$1";

and then it all works provided we don’t try to store a value of *sc.
I will check this.

=========================

But the problem for varray is slightly different because the projection
can be an integer expression. At the moment

v . 42

is defined as

42 v

and we can define that by

fun apply[T] : int * varray[T] -> T = "$2[$1]"

because a varray is actually an ordinary machine pointer to T in C.
Varray differs from C array in that both the max length and used length
are recorded in JudyArrays which allows bounds checking and to fetch
these two associated values. The problem is the same kind of problem
for storing a value though. We need

fun apply[T]: int * _proxy_varray[T] -> &T = "$2+$1";

So then

v->42

will resolve to

(*v) . 42

which is an application of 42 to a proxy again. This invokes the apply method.

The machinery therefore also works for varray. It relies on a phantom type

_proxy_typename

to be defined along with typename. Since it's a phantom, it must be erased
during code generation. If it isn’t, what happens depends on the definition:
we could use “” or “void” or just define it to be the same type.



John Skaller
ska...@internode.on.net





John Skaller2

Sep 20, 2020, 10:22:23 AM9/20/20
to felix google, Gordon Childs
OK it works now:

objc-bind
@interface SmallClass<hasDescription>
{
int x;
y: int;
}
@end
;
...
println$ sc->x, sc->y;
sc . x <- 88;
println$ sc->x, sc->y;

output:

(42, 98)
(88, 98)

if you write


var bad = *sc;
println$ bad . x;

all hell breaks loose at C++ compile time because I bound the deref proxy type to C++ void:

field has incomplete type 'flxusr::objc__hyphen_bind::_a64464t_66332'
(aka 'void')
_a64464t_66332 bad;

Now technically I could make that work by simply binding the proxy type to the SAME type.
If I do this, it looks nice and you can even make a copy but you have to remember it works
like a pointer and *sc is STILL a pointer. You might expect since SmallClass is a pointer
it can be dereferenced to fetch the underlying structure as a value but that isn't what happens
(and cannot, because Objective C somehow prevents it: you cannot dereference an NSString *
even though it's an actual machine pointer, even in Objective C. I guess the copy constructor
is invoked and is disabled, leading to a compile time error.)

So the machinery is not exactly pleasing .. but it does work in practice.


John Skaller
ska...@internode.on.net





John Skaller2

Sep 20, 2020, 6:56:29 PM9/20/20
to felix google, Gordon Childs
Properties will work soon. They're pretty useless.
Take care: do NOT define an ivar and property with the same name.

The only attribute which has any semantics is readonly. If found,
only a getter method is generated. Otherwise a getter and setter.

PROTOCOLS AND SUPERTYPE.

**** instancetype is not implemented ****

instancetype is a kind of generic but what it really represents
is covariant return. In C++, consider

struct B { virtual B* method(); }
struct D : B { D* method(); }

This is type safe and correct because the return type of the overriding method
is covariant. This is what ObjC instancetype means. It is like writing:

struct B { virtual instancetype* method(); }
struct D : B { instancetype* method(); }

where the first use is replaced by B, and the second by D. In Objective C this
is important although it HAS NO SEMANTICS. Why? Because ObjC is dynamically
typed, so we don't care if the type returned is NSObject* or even id, method calls
on the object will work anyhow, based on the actual run time type of the object.

The difference is optimisation. If you know the type statically the method can be called
directly without using the run time method lookup table.

In the Felix binding all hell breaks loose because Felix is rigidly type safe.
Here’s what happens: Consider a protocol

@protocol Copyable -(Copyable)copy; @end

and some class

@interface MyClass<Copyable> -(int)get; @end

So now this will not work:

var my : MyClass;
var cpy = my.copy;
println$ cpy.get;

Why? because this doesn’t work either:

var cpy: MyClass = my.copy; // TYPE ERROR

because the protocol returns a Copyable. Sure, a MyClass isA Copyable
but it also has a get method and that gets lost because the return of
copy is invariant in the static object type which is just Copyable.

If we had instancetype implemented, the return would be covariant,
and the static type would be MyClass and the get method wouldn’t get lost.

This requires SIGNIFICANT CHANGES to the compiler. They're not hard
but it's very boring and it invades the system. The way we do this is we have
to keep track of the supertype and protocol methods in a table. Then when the class
is defined, we have to copy all the methods into it, inheriting the methods, and changing
"instancetype" to the current class type.

Actually the implementation of every one of these inherited methods is done the same
way, we just call the existing method and cast the return type covariantly. This is a downcast
but it is safe (assuming the underlying Objective C method calls work correctly!!).
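A sketch of such an inherited wrapper for the Copyable/MyClass example above (only the idea, not the code the compiler actually generates):

fun copy (x:MyClass) : MyClass =>
C_hack::cast[MyClass] (copy (x :>> Copyable))
;

The inner call resolves to the Copyable binding because the coerced argument has exactly that type, and the cast supplies the covariant return.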

There are two exceptions to this rule:

1. If a covariant return is NOT used in the method we don’t have to bother
2. If the method is overridden in the class we had BETTER NOT also copy it

A prime example of (2) is of course init. The current class init method usually calls
the super type init method so the two must be kept separated.


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Sep 20, 2020, 7:50:58 PM9/20/20
to felix google, Gordon Childs
Properties now work:


objc-bind
@interface SmallClass<hasDescription>
{
int x;
y: int;
}
+(SmallClass)alloc;
-(SmallClass)init;
-(int)get1977;
-(int)getsum:(int);
@property int z;
@property (readonly) q:int;
@end
requires small_class_interface, small_class_implementation, package "foundation", package "objc"
;



println$ "z=" + sc.z.str;
println$ "q=" + sc.q.str;
sc.setz' 44;
// sc.setq' 44; // this should fail
println$ "z=" + sc.z.str;


If I uncomment the setq' call:

SIMPLE NAME setq' NOT FOUND ERROR
In /Users/skaller/felix/objc-bind.flx: line 87, cols 4 to 9
86: sc.setz' 44;
87: sc.setq' 44; // this should fail
******
88: println$ "z=" + sc.z.str;

Routine: inner_lookup_name_in_env



John Skaller
ska...@internode.on.net





John Skaller2

Sep 20, 2020, 9:40:21 PM9/20/20
to felix google, Gordon Childs
OK so now, I think everything required is working EXCEPT

1. instancetype.
2. Class methods in protocols

The latter requires some more research. At present a class method in an interface
is modelled by just using the class name as part of the method name, for example:

@interface MyClass +(int)cmeth; @end

println$ MyClass'cmeth ();

This is tested:

...
+(SmallClass)alloc;
-(SmallClass)init;


var sc : SmallClass = #SmallClass'alloc.init;

But this isn't much use in protocols. The alternative is to require the class object
as an argument AND/OR an instance. So it would work like:

SmallClass . alloc

or even

sc . alloc

if we allowed an instance. But this needs a type AnyClass to be defined which is
the type of class objects, and then we need to define a constant of that type for
the class object:

const MyClassObject : AnyClass = "&MyClass";

assuming the RHS actually works which assumes MyClass is a static
object. Something like that does work but the name is not as shown.
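A form that does exist on the Objective C side (a sketch only, nothing like this is implemented) would bind AnyClass to the ObjC Class type and use a class-object expression for the constant:

type AnyClass = "Class";
const SmallClassObject : AnyClass = "[SmallClass class]";

Then SmallClassObject could be passed around as an ordinary value, at the cost of losing any static knowledge of which class it denotes.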

The downside of this is that all class methods would be global so for example
declaring “alloc” would be a problem, since there would only be one. I need to find
a better way to do this probably with a special function that accepts a selector value.

Anyhow, more research is needed there.

The instancetype problem is daunting. At the moment, the system cheats.
The AST from the parser is mapped into binding code in sex2flx in Flx_desugar,
the bindings are all ordinary Felix bindings so that is the total extent of the
invasion into the compiler. In fact it is only done there because it's easier to decode
things in Ocaml than Scheme .. it all COULD have been done in the parser.

But instancetype requires tables of methods to be kept, and the bindings of
inherited methods deferred until all the tables are generated and partially
populated with explicit methods. Then the tables have to be filled in using
the transitive closure of non-overridden methods returning instancetype.

The actual methods so inherited are just the original ones with a downcast,
although build a function in Felix is laborious it’s really only one routine.
The big problem is that the tables have to be kept, installed in the symbol
table, and then populated by the transitive closure (AFTER installing in
the symbol table, and using a suitable topological sort to find the transitive
closure). Then the extra inherited methods get added to the symbol table
and the actual tables get deleted.

It's probably easier to use an auxiliary table. In that case the job can be
finished earlier, as soon as the separate parse trees of a program are merged
into a single tree.

It remains also to handle optional methods of protocols. This is normally done by
allowing a test to determine if a method is supported, but that’s standard library
stuff I think.






John Skaller
ska...@internode.on.net





John Skaller2

Sep 21, 2020, 7:22:34 AM9/21/20
to felix google, Gordon Childs
Here’s a bad idea. All class instance and protocol instance methods have one
of the shapes

D -> C
or
D-> A -> C
or
D -> 0
or
D-> A -> 0

Suppose we have a special phantom type

instancetype_t

This type is bound as usual but it is a phantom so must be eliminated.

1. The variable C is the only one that can be instancetype_t
2. Only functional methods can have C = instancetype_t

So now it is possible by scanning the symbol table to FIND all these
methods.

Now, we ask, what is D? By construction it is always the type of the class
or protocol in which the method is defined. Therefore, we know the
containing class of all such methods.

Now, we could replace instancetype_t with D, to get a correctly formed
statically type safe method and we shall do so but we shall also do MORE.

Suppose we call the method with object type B. If the call succeeds,
then B must be a subtype of D. In this case the return type should also
be set to B.

We can do this with a trivial wrapper for the first case:

fun wrap1[D,B] (meth:D -> instancetype_t, obj:B) =>
C_hack::cast[B] (meth obj)
;

We’d prefer this:

fun wrap[D,B] (meth:D->instancetype_t) (obj:B) => …

however I’m not completely sure that will set B.

Note that despite the cast, it is typesafe! The function will fail
if meth cannot be applied to obj, in other words B has to be a subtype
of D for it to work.


The only hassle is type variables cannot play with subtypes directly.
So we must defer the correctness check like so:

fun wrap[D,B] (meth:D->instancetype_t, obj:B) =>
(meth (obj :>> D)):>>D
;

Coercions always succeed polymorphically and fail after monomorphisation.
We have to allow instancetype_t to exist now and be coerced to anything.
This one needs to be implemented with a static_cast. To be sure we can
define instancetype_t now to be void* in C, which can always be cast
to any other pointer type.



John Skaller
ska...@internode.on.net





John Skaller2

Sep 21, 2020, 12:29:20 PM9/21/20
to felix google, Gordon Childs
Yeha!

It works!! I hacked the compiler a bit.

The actual Objective C class defines these methods for SmallClass:

-(instancetype)cpy { return self; }
-(instancetype)cpywithmsg:(NSString*)s { NSLog(s); return self; }

the Felix code declares a protocol Cpy:

objc-bind
@protocol Cpy
-(instancetype)cpy;
-(instancetype)cpywithmsg:(NSString);
@end
;

which is used in the SmallClass interface

objc-bind
@interface SmallClass<hasDescription, Cpy>
{
int x;
y: int;
}
+(SmallClass)alloc;
-(instancetype)init;
-(int)get1977;
-(int)getsum:(int);
@property int z;
@property (readonly) q:int;
@end
requires small_class_interface, small_class_implementation, package "foundation", package "objc"
;

You will notice the return type in the protocol is “instancetype”.
The method is not declared in SmallClass directly, and it is NOT INHERITED.

Instead I used a trick to make this work:


// Covariance
var sc2 : SmallClass = sc.cpy;
println$ "sc2.z=" + sc2.z.str;
var sc3 : SmallClass = sc.cpywithmsg' @"Copy with message";
println$ "sc3.z=" + sc3.z.str;

Notice, cpy IS returning a SmallClass; the type of sc2 and sc3 is explicitly specified
so it MUST agree with the return type of cpy and cpywithmsg'. Actually this syntax
now allows the RHS to be a subtype of the specified type, but we have the OPPOSITE
situation here: SmallClass is a subtype of Cpy!!!

It can only work by downcasting the return type of Cpy to SmallClass and that is what
is done. AUTOMATICALLY.

The second case is similar to the first but the first is easier to explain.

When Felix sees an application

method object

which has type instancetype, the expression is coerced to the type of object.
That’s it. We KNOW its an Objective C method because that’s the only place
we should be using instancetype. Also ALL method have one of two signatures:

D -> C
D -> A -> C

are the only two options, because, I generate them like that, C has three cases:

1) 0 if the method is a procedure (not interested)
2) an ordinary type (not interested)
3) instancetype

In case three we simply coerce the result of applying the method to a value
of type OBJECT to type OBJECT.

In fact instancetype is a special compiler known type, but it is actually set
equivalent to void*. Since an instancetype is the type of an instance of a class
and ALL such types are pointers to some structure named after the class,
a simple cast of pointers is done in the backend.

The covariance detection relies on the fact that with ObjC OO the argument
of all methods is an instance type.

NOTE: none of this has ANY impact on the underlying Objective C code.
This is purely to get rigidly type safe Felix bindings. You can of course
still do any kind of Objective C junk, and if your interfaces do not correctly
type the Objective C side, stuff will not work.

This is the last major feature required to make instance methods work.
Class methods of classes should also work. It's not clear what class methods
of protocols would do since there’s no way to specify what the class is.
[We’d need a special type like classtype and there’s no way to deduce that]




John Skaller
ska...@internode.on.net





John Skaller2

Sep 21, 2020, 6:38:16 PM9/21/20
to felix google, Gordon Childs
Here’s how you make a Felix closure into a block:

var cls = fun (x:int, y:double):int { stuff .. return result; };

type myblock_t = "int (^)(int,double)";
…. cexpr[myblock_t] "^int(int x, double y) { return $1->apply(x,y); }" cls endcexpr ...

We can generalise myblock_t a bit:

type block2_t[R,P1,P2] = "?1(^)(?2,?3)" (R,P1,P2);

this is for two arguments, rinse and repeat for N=0,1,3,4, etc. The block cexpr can be generalised:

fun block2[R,P1,P2] : (P1 * P2 -> R) -> block2_t[R,P1,P2] =
"^?1 (?2 p2, ?3 p3){ return $1->apply(p2,p3); }"
;

So the block2 function takes a Felix function as an argument and returns a block.
Again we have to repeat for multiple arguments. This block captures the Felix function,
which is a pointer, from the environment.

block2 (fun (x:int, y:double):int {return x; })

should work. The Felix function has enough type information to let block2 deduce the types.
The actual function will be heap allocated and have some magical C++ class type,
so the expression will resolve to a pointer to that class, and that class will have an apply
method.

The HASSLE is this: Felix garbage collects things. But when you dispatch this block
to some system function or ObjC API thing, Felix can no longer reach it so it might
be collected. To fix that we have to make it a root to prevent collection. That's easy.
The problem is unrooting it when it is no longer required. It would be nice if ARC could do that
but I suspect we would just tolerate leaks :-)


John Skaller
ska...@internode.on.net





John Skaller2

Sep 21, 2020, 10:09:52 PM9/21/20
to felix google, Gordon Childs
So here’s an actual working demo:

/////
header block0_typedef = """
template<class T>
struct block0 {
typedef T (^type) (void);
};
""";
type block0_t [R] = "block0<?1>::type" requires block0_typedef;
fun invoke0[R] : block0_t[R] -> R = "$1()";

var x = 42;
var f = cexpr [block0_t[int]] "^int { return $1; }" (x) endcexpr;

println$ f.invoke0;
x = 99;
println$ f.invoke0;
///////////

The header above is required because Felix does not support C type syntax.
When you say

type T = "U"

Felix just says

typedef U _t8476798;

This works fine for “int” and “int*” but not function types. So you have to first
make a typedef in C that reduces the function type to a syntax Felix accepts.
I used an old trick, a nested typedef, but the more modern

template<class T>
using typedefname = typeexpr_using_T;

from C++ would work too. I should populate the C++ headers with these things.
(They’re needed for plugins already I think .. did I already do it??)

Having done that I can then use the C++ template to make a type binding.
Note that ?1 is replaced by the first type variable’s binding on use.
Finally I can define the invoke function to call a block0.

Having done all that, we can now defined a block using a cexpr, and then
invoke it. Note this block captures x.

Apple/Clang say they capture by value at the time the closure is created.

WRONG. Apple LIES AGAIN.

The value of x is changed from 42 to 99 AFTER the closure is created
and stored in variable f. The second invocation prints 99.

Here is the ACTUAL CODE FELIX GENERATES:

void _init_(thread_frame_t *ptf){
ptf-> x = 42; //init
ptf-> f = ^int { return (ptf-> x); }; //init
{
_a17576t_65778 _tmp65786 = ::flx::rtl::strutil::str<int>(((ptf-> f)())) + ::std::string("\n") ;
::flx::rtl::ioutil::write(stdout,((_tmp65786)));
}
fflush(stdout);
ptf-> x = 99; //assign simple
{
_a17576t_65778 _tmp65787 = ::flx::rtl::strutil::str<int>(((ptf-> f)())) + ::std::string("\n") ;
::flx::rtl::ioutil::write(stdout,((_tmp65787)));
}
fflush(stdout);
}

ptf is a pointer to a struct of type thread_frame_t which contains the global data.
So ptf is on the stack whilst this particular function is executed, but x is not.
So I am guessing it copies ptf when the block is created, which has no effect,
it doesn’t copy ptf->x, the actual expression representing what is called x in Felix.

In particular blocks created inside C++ class non-static members will probably
just copy “this” not the non-static member variables of the instance.

Similarly blocks WILL NOT WORK AS ADVERTISED IN OBJECTIVE C.
Because in Objective C everything is a pointer to a structure, and only
the pointers will ever be copied. The only exception is int etc.
ObjC passes all class typed things by reference like almost all OO languages.
And Felix passes the context of closures the same way: as a pointer to the
lexically enclosing functions frame object, which is precisely a C++ class instance.

Perhaps blocks will work in functional C++ code because C++ uses value semantics
for types like strings, vectors, etc etc. Whole data structures get copied if you pass
by value, using copy constructors. But this will ONLY apply to the machine stack,
not the context of methods implicitly referred to by this (well I haven't tried it but
as a language designer .. it really HAS to work as described). It's actually amazing
the compiler can tell when a variable referred to in a block refers to an upscope
automatic (stack) variable.



John Skaller
ska...@internode.on.net





John Skaller2

Sep 21, 2020, 10:30:19 PM9/21/20
to felix google, Gordon Childs
Hmm, here’s something I didn’t implement:

id<p1, p2>

in ObjC this is an opaque type providing protocols p1 and p2.

In Felix, if some value x provides protocol p1, you can just write

fun f(x:p1) => …

because p1 is a type. You can dispatch all the methods of protocol p1 on x safely.

But there is no type p1_and_p2. A value conforming to both can be coerced to either.
This trick fixes it:

fun f(x1:p1, x2:p2) => ….
.. f (x,x) ...

but is hardly satisfactory. Protocols have record semantics, that is, if one has the same methods
as another and then some extra ones, it's a subtype. Subtyping means adding methods.

Therefore a protocol with all the methods of both p1 and p2 is a subtype of both p1 and p2.
In Felix record type addition can express this I think. Have to check.

But we need a way to express it in the binding to objective C and I don’t have one.
We want to write:

fun f(x:(p1 + p2))

meaning “conforms to both p1 and p2”. I think this is actually an intersection type.
It’s a bit weird that an intersection EXPANDS the methods of a type but that’s the case
for the same reason adding methods to a type makes it a subtype (smaller not bigger).
You can think of adding methods as making the type "more specialised" and then it makes sense.

The problem can be easily solved by creating a new nominal type:

class both<p1,p2>;

Done. But this is really bad. We really do want a combinator. Felix actually has intersection
types but they’re not used because I never figured out the semantics. But this is a case
where we actually need them.


John Skaller
ska...@internode.on.net





John Skaller2

Sep 21, 2020, 11:57:51 PM9/21/20
to felix google, Gordon Childs
Ok so I have introduced some convenience definitions:
In C++ header files:


template<class R,class P1, class P2>
struct _type3 {
typedef R (*cfun)(P1,P2);
typedef void (*cproc)(R,P1,P2);
#ifdef __clang__
typedef R (^bfun)(P1,P2);
typedef void (^bproc)(R,P1,P2);
#endif
};


and in Felix:

type cproc3[R,P1,P2] = "typename _type3<?1,?2,?3>::cproc";
type cfun3[R,P1,P2] = "typename _type3<?1,?2,?3>::cfun";
type bproc3[R,P1,P2] = "typename _type3<?1,?2,?3>::bproc";
type bfun3[R,P1,P2] = "typename _type3<?1,?2,?3>::bfun";

The integer specifies the number of type parameters and ranges from 1-5
at the moment. Note that the Felix bindings are NOT the same type as
for example:

P1 * P2 —> R

which is a C function type in Felix. The underlying C types ARE the same
but Felix doesn’t know that.

The reason for these is that Felix generates C code for a variable or
parameter like this:

T name;

This works for T = int, and T = int *, but it doesn’t work if T is a function type.
Or Clang block type. Or an array type for that matter, or C++ pointer to member,
or any of the other idiotic syntaxes used for types in C. The templates bring the
syntax back to something halfway sane.

The point is when you use a Felix type bound to a C++ type, the compiler cannot
“see into” the quoted text in the definition. So it HAS to be a syntactically “atomic”
type expression.
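A tiny C++ illustration of the point (a sketch with invented names): writing a function
type out literally buries the variable name inside the declarator, but wrapping it in a
struct typedef gives it an atomic spelling, so a declaration of the form "T name;" works again.

#include <iostream>

// Literal spelling: the name sits in the middle of the type syntax.
//   int (*f)(int);
// Wrapped spelling: the type is a single qualified name again.
template<class R, class P1>
struct _type1 {
  typedef R (*cfun)(P1);
};

static int twice(int x) { return 2 * x; }

int main() {
  _type1<int,int>::cfun f = twice;  // "T name;" style declaration
  std::cout << f(21) << "\n";       // prints 42
}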

So here is the old code again:

////////////////
header block0_typedef = """
template<class T>
struct block0 {
typedef T (^type) (void);
};
""";
type block0_t [R] = "block0<?1>::type" requires block0_typedef;
fun invoke0[R] : block0_t[R] -> R = "$1()";

var x = 42;
var f = cexpr [block0_t[int]] "^int { return $1; }" (x) endcexpr;

println$ f.invoke0;
x = 99;
println$ f.invoke0;
/////////////////

and here is the same thing using the new typenames:

///////////
fun invoke0[R] : bfun1[R]->R = "$1 ()";

var x = 42;
var f = cexpr [bfun1[int]] "^int { return $1; }" (x) endcexpr;

println$ f.invoke0;
x = 99;
println$ f.invoke0;
///////////

We got rid of the boilerplate crud by standardising it.



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Sep 22, 2020, 10:21:20 AM9/22/20
to felix google, Gordon Childs
It is necessary due to ObjC protocols to introduce intersection types.
I want to explain what they mean to myself mainly so I know how to implement them.

But first the motivation. Felix implements protocols and Objc instance types with inheritance
in a very clever way using the existing subtyping rules. Felix allows you to say a nominal
type U is a subtype of another type P. For stuctural types there are fixed rules built into
the unification engine.

The common example of subtyping which is at first confusing and then easy to understand
is record types, so we will start there. Suppose you have a function that requires a record
with fields f and g for example:

fun pr(x:(f:int, g:int)) => x.f.str + "," + x.g.str;

Now, it is safe to call

pr (f=1,g=2,h=3)

even though the argument is the wrong type, because it has the fields f and g
which function pr uses, so the extra field can’t hurt. Because of this understanding,
a value of the argument type here isA value of the type of the parameter x; in other words,
it’s a subtype. The way we implement it is as follows:

First, the unification engine says “passing the argument is OK because its type
is a subtype of the parameter”.

Next, when we are binding the application we notice the type of the argument is
wrong, but since unification says its OK anyhow we COERCE the argument to
the type of the parameter so now the coerced argument is exactly the right type:

pr ((f=1, g=2, h=3) :>> (f:int, g:int))

The coercion throws out the h field from the type, but what happens at run time?
In OO systems and FPLs with boxing everything is a pointer and probably nothing
happens. But in Felix, a record is a struct with the fields organised in alphabetical order.
So the coercion actually creates a new record of the correct type by filling in the required
fields from the argument and dropping others. The point is in Felix, coercions often
DO SOMETHING, they are not just casting the type to some other type.
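In C++ terms the coercion looks roughly like this (a sketch with invented struct names):
the argument struct has an extra field h, and coercing it to the parameter type builds a
brand new struct holding only f and g.

#include <string>
#include <iostream>

struct rec_fg  { int f; int g; };          // parameter type (f:int, g:int)
struct rec_fgh { int f; int g; int h; };   // argument type (f:int, g:int, h:int)

static rec_fg coerce(rec_fgh const& x) {   // the coercion DOES SOMETHING:
  return rec_fg{ x.f, x.g };               // copy the wanted fields, drop h
}

static std::string pr(rec_fg const& x) {
  return std::to_string(x.f) + "," + std::to_string(x.g);
}

int main() {
  std::cout << pr(coerce(rec_fgh{1,2,3})) << "\n";   // prints 1,2
}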

For nominal types, the user simultaneously specifies both the subtyping rule
and the coercion operation in one statement:

supertype (x:Super): Sub => ….;

or if they’re primitive bindings

supertype : Super : Sub = "…. what to do here .. ";

So the lesson so far: for records, adding fields creates a subtype.

What about ObjC protocols?

Same lesson. Adding methods creates a subtype.

Now look at this Felix binding:

@interface X<A,B> …

What this says is that class X instances have all the methods of protocol A and
all the methods of protocol B as well as any other method specified.

In other words, X is a subtype of A and also a subtype of B. In Felix A, B are
nominal types generated by the binding but they’re PHANTOM types in the sense
that no value has exactly those types. X values have each of those types as well
as X, so subtypes of A and B but are not exactly A or B usually.

In ObjC class instance values are just pointers, and a protocol underneath
is just some void* which is CAST to a type in the compiler to allow overloading
to find methods defined with that type as an argument.

But now there’s a problem. A function that needs both A and B: what is the
parameter type?? Whatever it is, X has to be a subtype of it!

The answer is an intersection type:

A & B

Remember ADDING methods or fields makes a type SMALLER not bigger.
There are FEWER values that have the additional methods than there are
without them. So intersection is ADDING methods because having a method
is a constraint, restriction, refinement or just plain smallifying operator :-)

The unification engine has the rule:

X < A & B iff X < A and X < B

in other words the argument type must be a subtype of BOTH the types
of a parameter whose type is an intersection of two types.

The problem is, how do we do overloading and what is the coercion required
to force the argument to the right type?

The answer is obvious now. What we do is try to overload on the first type
of the intersection. If that succeeds, we coerce the argument to that type.
Otherwise we try the second type.

I want to note now that Felix is going to do it in order, even though
intuitively & is symmetric:

A & B = B & A

in Felix this is NOT going to be true (although I could implement that;
doing things in sequence is easier than in parallel :-)

It matters. Consider the intersection of two record types with the SAME
field. What happens? Is it ambiguous or what?

In Felix, GIVEN ordering, its no problem because records allow duplicate
fields. The duplicates do NOT merge, the intersection record has BOTH.
But one must be found first so ordering matters.

OK so what if we have a tuple containing TWO intersections each of two
types? Then we must consider all 4 possible combinations. That part is easy,
and the order of the loop through each intersection just got specified,
but which one do we loop through first??

And how do we tell??? It’s easy to find an intersection type, but what if
the argument type is:

P * (Q & Y)

The answer in this case is obvious. This is exactly the same as

(P * Q) & (P * Y)

that is, intersection distributes over product formation. And clearly
over sum types too. The rule for functions is a bit trickier :-)

The point is that we HAVE to expand the types so the intersections
are always at the top otherwise we cannot simply run through the
intersection components trying overloads.



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Sep 22, 2020, 11:10:26 PM9/22/20
to felix google, Gordon Childs
YeHA!

A simple case of intersection types is now working. There’s more to do but
this is really nice:

//////////////////////////////////////
objc-bind
@protocol hasDescription
-(NSString)description;
@end
requires small_class_interface, small_class_implementation, package "foundation", package "objc"
;

objc-bind
@protocol Cpy
-(instancetype)cpy;
-(instancetype)cpywithmsg:(NSString);
@end
;

objc-bind
@interface SmallClass<hasDescription, Cpy>
{
int x;
y: int;
}
+(SmallClass)alloc;
-(instancetype)init;
-(int)get1977;
-(int)getsum:(int);
@property int z;
@property (readonly) q:int;
@end
;

var sc : SmallClass = #SmallClass'alloc.init;

// Covariance
var sc2 : SmallClass = sc.cpy;
println$ "sc2.z=" + sc2.z.str;
var sc3 : SmallClass = sc.cpywithmsg' @"Copy with message";
println$ "sc3.z=" + sc3.z.str;

// Intersection ******** LOOK HERE *************************
fun dc (x: hasDescription & Cpy) => x.cpy.description.str;
println$ sc.dc;
//////////////////////
sc2.z=44
2020-09-23 12:47:17.517 flx_run[22121:5639250] Copy with message
sc3.z=44
SmallClassInstance
//////////////////////

The function dc simply requires its argument to support both hasDescription and
Cpy protocols, which SmallClass does. The test first calls the cpy method
of the Cpy protocol, then calls the description method from the hasDescription
protocol on that.

Now do not be confused: it is clear why it works at the Objective C level.
That’s nothing interesting. The critical thing is that we have two features:

1. covariant instance types
2. anonymous (structural) intersection types

which enable full static type checking with correct method selection overloading
on a value of intersection type. Felix looks for the method in each intersectand
type: it performs TWO overloads and merges the results.

Intersections should work for records too, however protocols are just void* underneath
which is fine because all ObjC class instance types are just machine pointers.
Record values on the other hand are structs and subtyping coercions actually
work by copying fields to construct a new value. Function coercions are even
trickier.

As well as this the implementation is not complete. It relies on the intersection
being at the top level of any type. If an intersection is buried in the type,
for example a component of a tuple, it won’t work.

The fix for this is simple: all nested intersections will be top leveled
by every type constructor to ensure the normal form (intersections
are always at the top level). There are a few other normalisation rules:

* an empty intersection is void
* a one component intersection is just the component

The main normalisation rule is problematic! Consider

(A&B) -> C

Do we really want this type to be

(A->C) & (B->C)

??

in addition the loss of principal typing means types with intersections may not
always work when they contain type variables. The binding of a subtype argument
to an intersection parameter should always work though: that case supports
principal typing.


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Sep 23, 2020, 8:03:57 PM9/23/20
to felix google, Gordon Childs
Just to be clear on the unification issue. In the Ocaml code there is a routine “ge p a” that tests whether
a parameter is a supertype of an argument and provides the most general unifier (MGU) which makes
this so. The MGU is a list of assignments to dependent type variables. Actually the
routine requires a set D as well which lists the dependent type variables we are trying to solve for,
and takes a list of inequalities. If an inequality is not appropriate the routine tries for equality. The code
is found in src/compiler/flx_core/flx_unify.ml.

For example here is part of ge:


| BTYP_array (l, BTYP_unitsum n), BTYP_tuple rs ->
if List.length rs <> n then raise Not_found;
List.iter (fun r -> add_ge(l,r)) rs

This examines if an array parameter type l^n is a supertype of a tuple argument
with component list rs. If the length of the array and tuple don’t agree Not_found
is thrown indicating unification is impossible. Otherwise we add new inequalities
to solve, l must be supertype of each r in the list rs. Note it is MORE than just that,
we are looking for a set of assignments to type variables which make this true
for all such r *simultaneously*.

There is another rule:

| BTYP_tuple ls, BTYP_array (r,BTYP_unitsum n) ->
if List.length ls <> n then raise Not_found;
List.iter (fun l -> add_ge(l,r)) ls

which handles the case where the parameter is a tuple and the argument is an array.
We also have more cases:

| BTYP_tuple ls, BTYP_tuple rs ->
if List.length ls <> List.length rs then raise Not_found;
List.iter2 (fun l r -> add_ge(l,r)) ls rs

| BTYP_array (l, BTYP_unitsum n), BTYP_array (r, BTYP_unitsum m) ->
if m <> n then raise Not_found;
add_ge (l,r)

If you have followed some of the compact linear type stuff you will recognise the array
cases ARE NOT FULLY GENERAL. The algorithm cannot handle array exponents
which are compact linear but not unitsums. IT SHOULD.

Now, I am showing you this to show you how it works. For intersections:

| BTYP_intersect ts, t ->
List.iter (fun p -> add_ge (p, t)) ts

This says that the argument type t must be a subtype of the parameter type p,
for every p which is an intersectand of the parameter. We have to set the type
variables that make this so simultaneously. This is the standard subtyping rule.

Here’s the second case:

| t,BTYP_intersect ts ->
raise Not_found

WOOPS!!! There are some things to note. First, Ocaml processes pattern matches
in order so the parameter type t here CANNOT be an intersection because that
was already covered by the first rule. Also other cases are excluded because
the type constructors enforce normal forms. The ts list cannot be empty or contain
only one intersectand.

Never the less this case is obviously solvable:

int >= int & int

with an MGU containing no type variable assignments. And this one is too:

P >= A & A

where P is an independent (universally quantified) type variable, with the MGU

P <- A // assign A to P

because A & A = A. The rule for a JUDGEMENT here is simple: if ANY of the RHS intersectands
is a subtype of t, the intersection must be too. The intuition is: if one of the intersectands is a subset
of t, then intersecting it by anything can only make it even smaller, the result must remain
a subtype. Notice that this is sufficient but not necessary, it can still be that no intersectand
is a subtype, but the intersection is. This cannot be solved by structural comparisons however,
which is all unification engines can do.

There is a way to use unification to make this judgement. We only need to solve ONE
of the inequalities to be be happy. We cannot add to the list of equations to be solved
simultaneously, what we do instead is call unification RECURSIVELY for each component.
If at least one such call finds a unifier, the subtyping condition is satisfied.

THE PROBLEM IS THERE COULD BE MORE THAN ONE SUCH UNIFIER.
What do we do? Pick one? Unification is supposed to find the MOST GENERAL unifier.
And in this case there isn’t one.

Now I need to explain, why we MUST have the most general unifier. What we are actually
doing is taking the application of a polymorphic function, that is, a template, and specialising
it by fixing all the type variables so it handles a particular argument. Usually, the argument
can only contain existential type variables, that is, unknown types but ones which
cannot vary and are NOT dependent variables the unification is trying to set.
they look the same syntactically but they’re effectively constant types like “int”.
This happens when a polymorphic function is nested in another one, we’re overloading
the inner function and the overloads on the outer one are not relevant so its type
variables, universal when solving for it, are existential inside its body. This is something
type theorists never seem to understand.

But there is one special case: recursion. If a function calls itself we have to simultaneously
set the type variables for the outer (non-recursive) call to the function and the inner
recursion. We use alpha conversion to ensure we do not get confused about which
variables are dependent and which not in the inner call. But a solution requires
the most general unifier otherwise we might end up with polymorphic recursion
which we want to avoid: polymorphic recursion only works for polymorphic functions
and Felix has no polymorphic types (it uses type schema, so only functions can
be polymorphic).

In any case, finding the most general unifier is the only way to allow successive
refinement. If we didn’t use the most general case subsequent refinements
which would otherwise succeed could fail.

But it cannot be done for the case above EVEN THOUGH WE CAN JUDGE
THE SUBTYPING IS SOUND.



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Sep 24, 2020, 1:06:27 PM9/24/20
to felix google, Gordon Childs
I’m still struggling with unification. So let’s start with an example

K > T & U

Now I’ve been thinking about this wrong. We could set T to K, and U to anything bigger than K such as W,
the universal type OR U to K, and T to say W. So there are two unifiers neither more general than the other.

But this case simply cannot happen! The type variables in an inequality are always on the LHS.
We do not need to track which variables are dependent. We do this only to simplify equalities.
For example the cases

T^N = U * U

can be solved for the LHS by T=U, N=2. If we swap the order

U * U = T ^ N

we get the same solution: it’s an equality, PROVIDED we’re hunting for T and N. So by keeping
track of the dependent variables we reduce the number of pattern matches needed by half.

But in Felix inequalities

P > A

are ALWAYS trying to find substitutions for type variables of a function occurring in a parameter.
There are no variables in an argument that need setting (except in the type recursion case where
a variable ends up on both sides and that’s a mu variable defining the recursion which has
nothing to do with unification, it just happens to drop out of solving equations).

Felix does sometimes test for equivalence like this:

A > B and B > A

however. For example if f:A->B and x:C, and we see f x, Felix checks if A = C.
If not Felix inserts a coercion: f (x:>>A) to force the argument to the function domain.
Equality of intersections is really hard. But actually we can do this without caring
about a unifier. We could just insert the coercion anyhow. After monomorphisation
the coercion can be removed if the types are equal and that is easy to judge.

The problem case is still

K > T & U

but now T and U are monomorphic so we’re only trying to find substitutions for type variables
in the K term.

This looks hard but it isn’t! On the RHS there are only a very limited number of possible cases!

If the RHS terms are simply monomorphic nominal types, then:

if T > U then T & U = U, so K can be matched against a single non-intersection
which we can already do. If neither is a subtype of the other, the intersection is
irreducible and so we can just set K = T & U.

Since we use nominal phantoms for method protocols, the answer for records
must be the same. But the thing is we can actually CALCULATE the intersection.
We just collect all the mentioned fields, and intersect the result with any type variables
which are intersectands. These MUST become records on monomorphisation
or the intersection is void.

So actually there are only three kinds of base intersectands:
nominal types related by the subtyping graph the user supplied,
and records. Any other intersectand type has a shape, such as a tuple,
in which some component could be a base type (record or nominal type).
In this case ALL the intersectands must have this shape, since otherwise
the intersection is void.

In this case we perform a contraction, that is, we make a single term
of that shape, with a nested intersectand. Now we apply the rest of the algo.

For example:

(int * A) & (int * B) —> (int * (A & B))

I need to consider function types as well. But other types simply cannot be
intersected structurally, only their components can be.






John Skaller
ska...@internode.on.net





John Skaller2

unread,
Sep 26, 2020, 8:21:24 AM9/26/20
to felix google
This works:

~/felix>cat tmp.flx
struct X { x: int; y: double; };
var x = X (y=2.3,x=1);
println$ x._strr;

~/felix>flx tmp
X {x=1,y=2.3}

It works for cstructs too. The demo shows you can provide a record argument to a struct
constructor, where the record has the same field names and field types as the struct.

But this doesn’t work and I can’t figure out why:

typedef AudioFormatFlags = uint32;
typedef AudioFormatID = uint32;
cstruct AudioStreamBasicDescription {
mBitsPerChannel : uint32;
mBytesPerFrame : uint32;
mBytesPerPacket : uint32;
mChannelsPerFrame : uint32;
mFormatFlags : AudioFormatFlags;
mFormatID : AudioFormatID;
mFramesPerPacket : uint32;
mReserved : uint32;
mSampleRate : double;
};

// FIXME: cstruct should accept a record
var asbd = AudioStreamBasicDescription (
mSampleRate = 44100.0,
mFormatID = kAudioFormatLinearPCM,
mFormatFlags = kAudioFormatFlagIsFloat,
mBytesPerPacket = elementSize,
mFramesPerPacket = 1u32,
mBytesPerFrame = elementSize,
mChannelsPerFrame = 1u32,
mBitsPerChannel = 8u32 * elementSize,
mReserved = 0u32);

If I make an uninitialised var, and assign the components, there are no errors.
However subtyping could interfere so that doesn’t prove it should work.

The compiler says the record has type:

record(
mBitsPerChannel:(uint32),
mBytesPerFrame:(uint32),
mBytesPerPacket:(uint32),
mChannelsPerFrame:(uint32),
mFormatFlags:(uint32),
mFormatID:(uint32),
mFramesPerPacket:(uint32),
mReserved:(uint32),
mSampleRate:(double))

which is correct and the field names are all correct too as far as I can tell.
The constructor argument type is:

uint32 * uint32 * uint32 * uint32 * uint32 * uint32 * uint32 * uint32 * double

however it really should accept a record also.

in fact, the responsible routine is in src/compiler/flx_bind/flx_reorder.ml and it
actually matches parameter names with the record argument.


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Sep 26, 2020, 3:46:13 PM9/26/20
to felix google


> On 26 Sep 2020, at 22:21, John Skaller2 <ska...@internode.on.net> wrote:
>
> This works:
>
> ~/felix>cat tmp.flx
> struct X { x: int; y: double; };
> var x = X (y=2.3,x=1);
> println$ x._strr;
>
> ~/felix>flx tmp
> X {x=1,y=2.3}
>
> It works for cstructs too. The demo shows you can provide a record argument to a struct
> constructor, where the record has the same field names and field types as the struct.
>
> But this doesn’t work and I can’t figure out why:

because it was in an “open class thing { .. }”

and the handling of shadow scopes didn’t check.
It does now. However the initialisation is still wrong.
I think it uses a reinterpret cast which cannot possibly work with a record.



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Sep 28, 2020, 8:55:14 PM9/28/20
to felix google
Subtyping judgements with intersection types and objC protocols.
==================================================

I ran into a surprise here so I want to explain what should happen.
There are some cases that do NOT work but I don’t have a test case yet.
Our domain of discourse here is MONOMORPHIC subtyping and intersections,
polymorphic intersections are harder.

First: the representation of an ObjC protocol (or class) is a machine pointer.
Coercions are all nops: if A isA B, we can just use an A where we need a B.

An ObjC protocol is basically a set of methods. However this is IRRELEVANT
except to understand that an intersection of protocols

typedef AB = A & B;

consists of the methods of A and B. We will assume there are no conflicts
for the moment.

When you define a new protocol:

objc-bind @protocol AnB <A,B> @end;

this is NOT the same as the intersection AB. The reason is that this
is a NOMINAL type with coercions:

AnB -> A
AnB -> B

The intersection AB supports the same coercions. It SHOULD be the
case also that

AnB -> AB
// AB -> AnB NO NO

The reason for the last subtyping rule being not allowed
is to be seen by considering we MIGHT have written:

objc-bind @protocol AnB <A,B> -(int)x; @end;

so that AnB also has an extra method. Even though we didn’t, Felix
still does not allow AB -> AnB; although I could have implemented
that, code which typechecks would break if the extra method were added.

The core subtyping rules are:

A & B -> A
A & B -> B

The logic is: A & B has the methods of both A and B, so we can coerce it to A
by throwing away the B methods; for coercion to B we throw out the A methods.

If a type T is a subtype of A and also a subtype of B, it must also
be a subtype of A & B because it has all the methods of A and of B:

T -> A and T -> B ==> T -> A & B

Its not surprising, this is the definition of intersection!

What is not immediately obvious is this profound observation:

Let S be the category of types with arrows A -> B if A is a subtype of B.
The above rule now says

Let T, A, B be any types. Then there exists a unique h: T -> A & B
and functions piA: A & B -> A and piB: A & B -> B
such that ANY f: T -> A, g: T -> B can be FACTORED THROUGH A & B as follows:

h . piA : T -> A = f
h . piB : T -> B = g

Do you recognise this rule?

YOU DANG WELL SHOULD!!!!! Go back to school if you don’t!!!
This is the DEFINITION OF A PRODUCT.

We consider the arrows of S to be unique, in that there is at most
one arrow from A to B, either A is a subtype of B or not.

In Felix it would otherwise be ambiguous, however it is (or should be) allowed:
Felix should just pick one coercion. We have to do this because suppose A -> B
and B -> C then A -> C using two coercions. However the user
can specify another coercion A -> C as an optimisation.

Note that UNLIKE tuples, the & operator is both symmetric and associative.
For this subtyping lattice to be complete we need a minimal and maximal element.
The minimal element is clearly void or 0, it is a subtype of everything (including itself).
For ObjC that would be a NULL pointer because it clearly has “no methods”.

The top of the lattice is trickier! In ObjC it is “NSObject” if we like, or more extremely
the type id because everything is a subtype of id. This type has ALL the methods!
In particular ANY method can be safely called on id. Since this is clearly nonsense,
such a method call had better fail with an exception, the type represents something
for which a method call cannot return. The only possible exception is the copy operation
since its a nop.

In the literature this type is called w, which is lower case Omega.
[Its weird because w <= w -> w]

Felix has a related type defined by the recursion:

typedef any = any;

or the typecode

typedef any = x as x;

It is a self recursion:

\mu x . x

Every type satisfies the equation. I want to cover the rules for functions
using 0 and any. We note that 0 is not inhabited. We also note functional
programming weenies are WRONG not to allow this type! Some incorrectly
believe that because 0 maps to false via the Curry Howard isomorphism,
void cannot be allowed because the type system would be unsound.
This is WRONG. The theorem says the type system is unsound
if an INHABITED type maps to false, but void isn’t inhabited.

First we have to note 0 is not only useful it is MANDATORY.
A type system without it is NONSENSE. An example:

int ^ 0 = 1

is perfectly good data type, its an empty array which is precisely
the unit tuple. For all types T there is also a unique function

0 -> T

This function characterises T, that is, there’s a 1-1 correspondence
between types and the characteristic function of that type.

Of course this function cannot be applied because there are no
values of type 0, so there is no way to construct an argument
of type 0 to apply the function to, but it definitely exists.

Note that if you have a function with “several parameters” one
of which is type void, the function cannot be called because
the product of 0 and anything is 0.

Ok, so what about a function returning 0?
Well a function you write which accepts an argument of type A,
cannot map any elements of A to elements of 0, because 0
is empty. So it must also be that A = 0. In that case, the
function cannot be applied, so the fact it returns 0 is just fine.
It is in fact the identity for 0.

Let’s now consider “any”. If a parameter is any, the result of evaluating
the argument can be discarded because the parameter cannot be used.
This is true even if a function (such as this one!) taking any is applied to it,
because that function, in turn, cannot use the value. However note such
a function can return a value. If the domain of the function is any, the function
is necessarily constant.

If a function returns any, its result cannot be used. The function

0 -> any

exists and characterises any. The function

any -> 0

does not exist because that would imply 0 = any.

Now the key thing is to understand that any is the unit of intersection:

any & B = B
A & any = A

and therefore is the right choice for the nullary intersection of nothing.
It is the analogue of the unit tuple type being the empty product, for indeed,
in the subtyping category with intersections as products, intersection is
indeed .. a product :)

Representation note: the function 0->T for any type can be represented
in Felix by a NULL function pointer (recall in Felix a function is C++ heap
allocated object passed as a value by passing a pointer to it).

The function any -> K can be optimised away if the constant value in K is known.
Otherwise it has to be a real function. Its argument, however, can be just unit,
provided the argument actually passed is coerced to unit. This is safe, because
the resulting parameter value cannot be used.

To get the typing right, we must distinguish type variables that can be set to
0 and those that can’t. For example 0 is a valid type for variable N in
the array formula X ^ N. However it is not valid in the formula M * N.
Although that product exists, it doesn’t uniquely have two components.



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Sep 29, 2020, 8:30:06 PM9/29/20
to felix google
Felix has a feature designed to support using Felix functions as callbacks.
I’m going to explain how it works with a real example.

First here is the documented C type of the callback:

typedef OSStatus (*AURenderCallback)(
void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList *ioData
);


The following Felix C function almost has the right type:

cfun audioRenderCallback (
inRefCon: address, // void*
ioActionFlags: &AudioUnitRenderActionFlags,
inTimeStamp: &<AudioTimeStamp,
inBusNumber: uint32,
inNumberFrames: uint32,
ioData: AudioBufferList
) : OSStatus
{…}

namely

typedef audioRenderCallback_t =
address * // void*
&AudioUnitRenderActionFlags *
&<AudioTimeStamp *
uint32 *
uint32 *
AudioBufferList
--> OSStatus
;

Notice the --> used, which in Felix denotes a C function, as opposed to a Felix function.
C functions are machine pointers into the machine code. Felix functions are pointers
to C++ class objects which have an apply() method.

There is an issue in Felix at the moment, that read-only pointers like

&<AudioTimeStamp

are translated in the compiler back end to

AudioTimeStamp*

and not

AudioTimeStamp const*

which is required, so for the moment we have to cast the generated C function pointer:

ctypes AURenderCallback;

// Force the Felix cfun to have the type Apple wants
fun castCB : audioRenderCallback_t -> AURenderCallback =
"reinterpret_cast<AURenderCallback>($1)”
;

The pointer is installed in a control block:

cstruct AURenderCallbackStruct {
inputProc: AURenderCallback;
inputProcRefCon: address;
};

which accepts a C callback function pointer and a void* containing arbitrary
client data. The OS calls the callback passing the client data inputProcRefCon
to it as the first argument. Here is the required construction:

AURenderCallbackStruct (castCB(audioRenderCallback), (&sampleNumber).address)

where sampleNumber is an int, a pointer to which is cast to an address, installed in the
control block, and when the callback is invoked pass to it: the callback then casts
it back to an int*. The castCB is required only because of the const* issue.
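For reference, the raw pattern in plain C++ looks like this (a sketch with a stand-in
API, not the real AudioUnit one): a bare function pointer plus a void* of client data,
with the callback casting the void* back to its real type.

#include <iostream>

typedef int (*RenderCallback)(void* refCon, int numberFrames);

struct CallbackStruct {            // plays the role of AURenderCallbackStruct
  RenderCallback inputProc;
  void*          inputProcRefCon;
};

static int myCallback(void* refCon, int numberFrames) {
  int* sampleNumber = static_cast<int*>(refCon);  // cast the client data back
  *sampleNumber += numberFrames;
  return 0;                                       // "OSStatus" success
}

int main() {
  int sampleNumber = 0;
  CallbackStruct cb{ myCallback, &sampleNumber }; // install callback + data
  cb.inputProc(cb.inputProcRefCon, 256);          // what the OS would do
  std::cout << sampleNumber << "\n";              // prints 256
}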

================================================

What I’ve shown is the dumb way. Here’s the smart way:


// Define thunk
callback fun audioRenderCallback_thunk :
audioRenderCallback_thunk * // void*
&AudioUnitRenderActionFlags *
&<AudioTimeStamp *
uint32 *
uint32 *
AudioBufferList
-> OSStatus
;

This defines a C function where the SPECIAL use of the function name in its
own argument list signifies where the underlying C callback type expects
a void*.

Felix generates a C function which can be used as a callback, but which accepts
in the first slot above a Felix function pointer. The thunk calls the Felix function,
returning what it returns.

The thunk is a fixed C function which works for all callbacks of the specified type.
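The idea in plain C++ (a sketch only, with std::function standing in for a Felix
closure): the client data slot carries a pointer to a closure object, and one fixed
thunk casts it back and applies it, so both the code and the state come from the closure.

#include <functional>
#include <iostream>

typedef int (*RenderCallback)(void* refCon, int numberFrames);
using FlxClosure = std::function<int(int)>;        // stand-in for a Felix closure

static int thunk(void* refCon, int numberFrames) { // one thunk serves every
  FlxClosure* f = static_cast<FlxClosure*>(refCon);// closure of this callback type
  return (*f)(numberFrames);
}

int main() {
  int state = 0;                                   // captured state, like ptf->x
  FlxClosure closure = [&state](int n) { state += n; return 0; };
  RenderCallback cb = thunk;                       // goes where the C API wants a
  void* refCon = &closure;                         //   function pointer plus void*
  cb(refCon, 256);
  cb(refCon, 256);
  std::cout << state << "\n";                      // prints 512
}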

So the type of the thunk is in fact messy:

typedef audioRenderCallback_thunk_t =
(
&AudioUnitRenderActionFlags *
&<AudioTimeStamp *
uint32 *
uint32 *
AudioBufferList
-> OSStatus
) *
&AudioUnitRenderActionFlags *
&<AudioTimeStamp *
uint32 *
uint32 *
AudioBufferList
--> OSStatus
;

Notice this is a C function but the first argument to it is a Felix function!
The Felix function has the same type MINUS the client data.

Why remove the client data? Because Felix has closures, and the C function
pointer TOGETHER with the client data pointer is C’s way of representing
a closure. The Felix function points at a C++ object so it has both CODE
in the form of an apply() method and DATA as non-static members.

The type shown above is the exact type required to accept the Felix
function and execute it (passing it the other arguments).

but it is the WRONG type for the C callback, so we have to cast it:


// Force the Felix thunk to have the type Apple wants
fun castCB_thunk : audioRenderCallback_thunk_t -> AURenderCallback =
"reinterpret_cast<AURenderCallback>($1)”
;

It is wrong on TWO counts, the const issue as before AND because its
first parameter accepts a Felix function, BUT the AURenderCallbackStruct
requires a void* in that position.

Here is the actual construction:

AURenderCallbackStruct (
castCB_thunk(audioRenderCallback_thunk),
C_hack::cast[address](felix_callback &sampleNumber.address)
)

You can see the first argument is just the thunk, cast to the right type.
The second argument is the actual Felix function. It’s the client data.
It has to be cast to void* too, and I had to use C_hack here because
there is (as yet) no conversion in the library of function pointers
to an address.

I haven’t shown you the Felix function yet. Since the C callback works
we’re just going to call it instead of writing the code in Felix:


gen felix_callback (inRefCon: address) // the original client data
(
ioActionFlags: &AudioUnitRenderActionFlags,
inTimeStamp: &<AudioTimeStamp,
inBusNumber: uint32,
inNumberFrames: uint32,
ioData: AudioBufferList
) : OSStatus
{
while true do
yield audioRenderCallback (inRefCon, ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames,ioData);
done
}

This is actually a coroutine. It yields the result returned by manually invoking the previously
defined callback. Normally you’d put Felix code in there.

Using a generator is crucial to ensure the code is not reentrant, and thus preserves
state across calls (although this one has none). In addition, when a Felix function is invoked
it is clone()d first, to ensure it has a fresh object. This does a heap allocation and a copy
of the original function, which is in fact never invoked: it is a prototype “clean” function
only, used for cloning the actual function objects which can get dirty. The reason this
is necessary is to make the functions reentrant which is necessary to support both
thread safety and recursion.

In a generator, however, clone() simply returns the C++ this pointer. This means any
dirty variables retain their old values on the next call, which is exactly what we want
for our callback. Iterator functions in Felix are also generators. In our example
an IMPORTANT piece of state (actually the only piece) is implicit: the function
program counter. That too is preserved, and so after yielding the result of the first
call, on the next call it will continue on where it left off. We don’t use that feature
in our example but we could. This is the MAIN reason for using Felix callbacks.
They maintain state automatically.

Notice I used a Higher Order Function,
with the argument being the original client data, and which returns the actual function
to be passed to the thunk.

So this completes the explanation. Here are issues raised:

=========================

1. read only pointers are rendered in the back end as read-write pointers
Already issue on GitHub. Fix should be simple but you never know :-)

2. The callback thunk Felix generates has exactly the correct type
for FELIX to call the callback passing in the function to be called,
but if the thunk is used in a binding to a foreign API the type required in that
slot will be a void*. We could change the type to a void*, and make the thunk
do the cast internally, but then a FELIX use of the thunk would have to
cast a Felix function to a void* (which the thunk would immediately cast back).

So the type required depends on whether the thunk is called by Felix or C.
It’s acceptable to cast the thunk to the C type required, but it is NOT acceptable
that the client has to write it out. The problem is you have to write the correct Felix
type out in full since that is the parameter type of the casting function.
The C type is usually defined in the host C code and the conversion
function itself will just be a reinterpret cast to that type. However the conversion
function return type has to be written so it is lifted from the C code.

In some cases the client C API does not typedef the C function type
and in that case the Felix programmer will have to do it in a floating
header insertion in order to lift it into Felix, and to use it in C cast syntax
easily.

This is all a pain. The callback construction SHOULD PROBABLY also
provide at least the required typedef for the expected C function,
and perhaps generate the cast needed as well.

More use cases needed and more research. Using a reinterpret cast
written by hand is extremely dangerous because it squashes type errors.

ONE solution is to have either

ccallback ….

or

callback cfun …

do the same thing EXCEPT it uses a void* in the client data position
instead of the Felix function type. It then has to do the cast internally.
This is also unsafe because any crap might be provided.

However in the C binding case, we have to cast the Felix function
to a void* anyhow when registering the callback, so that is the
danger point, because the client programmer is never actually
calling the callback, a foreign event loop is doing it.

In fact .. just changing the type might be the best answer since
almost all uses of this feature are for bindings.




John Skaller
ska...@internode.on.net





John Skaller2

unread,
Oct 5, 2020, 2:23:48 AM10/5/20
to felix google, Gordon Childs
I am designing a subtyping category and have hit the following snag (and solved it).

My rules for subtyping coercions are that they MUST be monic: they must be 1-1 and
be embeddings. Note that composition of monic arrows yields a monic arrow.

However Felix accepts the following record subtyping:

fun f(x:( a: int, b:int) ) => …
.. f (a=1, b=2, c=3) ..

which is CLEARLY NOT ONE TO ONE. The problem is in the fact record types
are CLOSED. The definition which works correctly is a thingo which has
*at least* the specified fields. Consider now polyrecords:

fun f[T] (x: (a:int, b:int | T) => …

This is a type of product *correctly* representing a type with a “lower bound
on the number of methods” … to put it crudely.

Note that the function is polymorphic but the type is NOT. This is:

the set of all records with at least fields a:int and b:int

Note this rule has an open interpretation as follows: any nonvoid type
which replaces T is valid. If it’s a record the fields are merged,
tuples and arrays are just records with “empty” field names.
If none of the above, it is still a product of one type, which is identical
in Felix to that type (the projection is of course the identity function).

Claim: (a:int, b:int, c:int | K) is a subtype of (a:int, b:int | T).
Proof: set T <- (c:int | K)



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Oct 13, 2020, 3:05:49 AM10/13/20
to felix google, Gordon Childs
Is Felix channel I/O fast?

Test code below but here is the output:

Done 5000000 in 0.253824 sec or 19.6987M per second

That’s 20 million context switches per second. I think that’s
fast enough to keep up with audio signals. :-)

Zero allocations in the code. Using the new read/write_address primitives
which appear to work flawlessly.

/////// TEST CODE ////////
chip Add1
connector io
pin out: %> uintptr
pin inp: %< uintptr
{
var sample = 0;

while true do
var number: uintptr;
read_address (C_hack::cast[_schannel] io.inp, C_hack::cast[&address] &number);
++number;
write_address(C_hack::cast[_schannel] io.out, C_hack::cast[address] number);
done
}

chip Zeros
connector io
pin out: %> uintptr
{
var start = time();
var i = 0;
while i < nmsgs do
write_address(C_hack::cast[_schannel] io.out, C_hack::cast[address] 0ju);
++i;
done
var finish = time();
var elapsed = finish - start;
var ttrans = nmsgs * ntrans;
var rate = ttrans.double / elapsed / 1.0e6;
println$ "Done " + ttrans.str + " in " + elapsed.str + " sec or " + rate.str + "M per second";
}

chip printer
connector io
pin inp: %< uintptr
{
var number : uintptr;
while true do
read_address (C_hack::cast[_schannel] io.inp, C_hack::cast[&address] &number);
//println$ "Got " + number.str;
done
}

var ntrans = 5;
var nmsgs = 1000000; // 1M
device a1 = Add1;
device a2 = Add1;
device a3 = Add1;
device a4 = Add1;
circuit
connect Zeros.out, a1.inp
connect a1.out, a2.inp
connect a2.out, a3.inp
connect a3.out, a4.inp
connect a4.out, printer.inp
endcircuit
/////////////////////


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Oct 18, 2020, 7:58:11 PM10/18/20
to felix google
This code generates a direct loop (no iterator object):

var k = 10;
for i in 0 ..< k perform println$ i;


So does this:

chip silence
connector io
pin acquire : %< &block256_t
pin out: %>&block256_t
{
var p: &block256_t;
while true do
read_pointer$ io.acquire, &p;
for( var i = 0; i < 256; ++i;) do p . i <- 0.0f; done
write_pointer$ io.out, p;
done
}

but if I change the for loop to read

for i in 0 ..< 256

it uses an iterator which is a heap object.

_generator_async_audio_743 = (FLX_NEWP(iterator)(ptf))
->apply(::flx::rtl::_uctor_(6, new(*ptf->gcp, _at100795_ptr_map, true) _at100795 (_at100795(0, 256)))); //init
continue__ll_async_audio_736_L69974:;
_vI64711_silence__apos_2_mv_64711 = (_generator_async_audio_743)->clone()
->apply() /* general apply */; //init
/*begin match*/
/*match case 1:|Some i_param_tone-player_64710*/
if(!andthen(FLX_VNI(_vI64711_silence__apos_2_mv_64711)==1,(FLX_NEWP(_lam_64714)(ptf, this, ptrsilence))) ) goto _ml64713_L69975;
_vI69972_i_param_tone__hyphen_player_64710 = *((int*)FLX_VNP(_vI64711_silence__apos_2_mv_64711)); //init
{if(FLX_UNLIKELY(!(((static_cast<_a112t_100777>((static_cast<_a112t_100777>(_vI69972_i_param_tone__hyphen_player_64710)/*int.flx: ctor size from _a112t_100777*/))/*int.flx: ctor size from _a112t_100777*/) < 256))))
FLX_ASSERT_FAILURE("/Users/skaller/felix/src/packages/arrays.fdoc",46,5,46,27);}
_vI96451_a = _vI69968_p; //init
*&(_vI96451_a->data)[(static_cast<_a112t_100777>((static_cast<_a112t_100777>(_vI69972_i_param_tone__hyphen_player_64710)/*int.flx: ctor size from _a112t_100777*/))/*int.flx: ctor size from _a112t_100777*/)]=0.0f; // storeat
goto continue__ll_async_audio_736_L69974;
_ml64713_L69975:;
/*match case 2:|None*/
if(!(FLX_VNI(_vI64711_silence__apos_2_mv_64711)==0) ) goto _ml64715_L69976;
goto _em64711_L69977;
_ml64715_L69976:;

This is EXTREMELY bad in this context, and even worse the “andthen” method is used
which is a shortcut with type

bool * (1->bool) -> bool

In other words we get shortcut behaviour by passing the second argument as a closure.

The question is how the parser has decided to use the grammar for a GENERAL iterator
when the specific start/end case could have been used, and WAS used in the micro
test given at the start of this post?

technically, the two element case is a subset of the general case both syntactically
and semantically but the performance difference is extreme. In other words

0 ..< 256

is actually a slice object which has an iterator method. Felix handles this:

var range = 0 ..< 256;
for i in range do .. done

where in general the range object is just any iterator, which is nice,
but the special case is only used some of the time. Hmmm...
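The performance difference is easy to see in a C++ caricature (a sketch, not the
generated code): the direct form is a plain counted loop, while the general form drives
a heap-allocated iterator object through repeated calls, which is essentially what the
generated code above is doing.

#include <iostream>
#include <memory>
#include <optional>

struct SliceIterator {                  // stand-in for the slice's iterator object
  int cur, end;
  std::optional<int> next() {
    if (cur >= end) return std::nullopt;
    return cur++;
  }
};

int main() {
  long sum1 = 0;                        // direct loop: no allocation, no dispatch
  for (int i = 0; i < 256; ++i) sum1 += i;

  long sum2 = 0;                        // general protocol: allocate, call until None
  auto it = std::make_unique<SliceIterator>(SliceIterator{0, 256});
  while (auto v = it->next()) sum2 += *v;

  std::cout << sum1 << " " << sum2 << "\n";   // both print 32640
}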





John Skaller
ska...@internode.on.net





John Skaller2

unread,
Oct 18, 2020, 11:19:47 PM10/18/20
to felix google
Trying to work out constraints on subtyping coercions, I have a good set of adequate rules:

1. If A -> B and B -> A then all arrows A->B are equal, all arrows B-> A are
equal and inverse to arrows A->B, hence equivalent objects must be isomorphic

2. If f: A -> B and g:A->B then f = g

3. All arrows must be monic

Now, we really want subtyping coercions to be embeddings, for sets this is
easy, its harder to know how to specify that. Anyhow one of the seminal papers:

http://reports-archive.adm.cs.cmu.edu/anon/1999/CMU-CS-99-156.pdf

Liskov and Wing. I warn the paper is old, waffly, and fundamentally flawed
although the idea is roughly correct. For example this constraint:

Subtype Requirement: Let φ(x) be a property provable about objects x of type T.
Then φ(y) should be true for objects y of type S where S is a subtype of T.

the "Liskov substtitution principle", is clearly bogus: we think a record with
two fields is a subtype of one with one field, yet by specification the
property “has N fields” is not shared by the two types.

In fact the fault is not properly identified in a lot of literature,
a record of two fields CANNOT EVER be a subtype of one with
one field. The clue is the missing but implied word “precisely”.
to make subtyping valid we have to change the specification
to read “at least N fields”.

In Felix, a monomorphic record is a closed type and the current subtyping
rules are just plain wrong. If we use the open specification “at least N”
we recover but the type of a polyrecord make this clear:

(a:int | T)

means “at least the field a:int” but any extra are subsumed by the type variable T.
Ocaml has explicit open types, in fact, it can say “at least” and “at most” (the latter
being required for coproduct subtyping).


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Oct 24, 2020, 1:39:47 PM10/24/20
to felix-l...@googlegroups.com
I’ve just reworked some of the logic for binding applications so a quick review.
The general case of an application is:

f a

or

a.f


where f is a function expression whose domain is a supertype of the type of a.
There is a “special" case in which f is a function name, in which case
the “best match” of all the matching functions of that name in the same
scope is chosen, if any, or a more outer scope tried: this is called
overload resolution. I put “special” in quotes because this case is far
more common than an expression which is not a simple name.

There is another special case, where f is not a function.
Generally, the user can define an apply function with a pair
of arguments (f,a) which will be tried if f is not a function.
The first argument f must not be a function. For example
the library defines

fun apply (f:string, a:string) => f + a;

meaning you can write

“Hello ” “World"

and mean string concatenation. Mathematicians often write

2x

to mean 2 * x and you can define this by

fun apply (x:int, y:int) => x * y;

There are also some special builtin functions including but not limited to:

\langle f,g \rangle // mediating morphism of product
\lbrack f,g \rbrack // mediating morphism of sum
_tie (f,g) // tuple of pointers
\prod (f,g) // parallel function product
\sum (f,g) // parallel function sum

as well as generics _strr, _map and some others I can’t even remember.

There are more special cases though. These are projection shortcuts

1 x // tuple or array projection (or pointer)
i x // array projection (where i is int type)
clt x // array projection where clt is index type
field r // struct or record field projection

and note the most common use is like x.1 rather than 1 x although
these are the same.

There’s also a very dubious case for

+ (a,b)

usually written

a + b

which is record or tuple addition, that is, concatenation of fields.
This is very dubious because it can only work if + is not actually
defined on the argument types.

Then there’s the special case where f is a type and the user
defined a ctor function. And a related case where f is the name
of a struct or cstruct, in which case it can be constructed from
either a tuple of the correct type, or a record of the correct type.

In addition, a function can accept a record with fields the same name
as its parameters, simulating named arguments, and on top
of all that you can have default arguments as well!
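A loose C++ analogue of the record-as-named-arguments idea (a sketch, requires C++20
designated initialisers): pass an aggregate whose field names mirror the parameters,
and defaults fall out of the member initialisers.

#include <iostream>

struct Args { int width = 80; int height = 24; bool wrap = true; };

static void render(Args const& a) {
  std::cout << a.width << "x" << a.height << (a.wrap ? " wrapped" : "") << "\n";
}

int main() {
  render({ .width = 132 });                 // "named argument" plus defaults
  render({ .height = 50, .wrap = false });  // any subset, in declaration order
}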

It’s even worse because overload resolution not only depends on subtyping
rules, Felix ALSO allows constraints, and on top of that the special kind
of constraint called a typeset (which is going to be replaced by a kinding constraint
eventuall I hope).

The order in which all these things are tried is crucial. A special case can fail two
ways with distinct consequences:

(a) the special case applies, but was wrong
(b) the special case doesn’t apply

In case (a) you get a hard error, for example

(1,2).3

is a hard error of kind (a) because there is no third component of the argument tuple,
however

1 . 3

is a soft error of kind (b) because 1 is not a tuple of two or more components,
in which case other special cases can be applied (in this case the user
“apply” function is used if defined .. otherwise its a hard error).

It’s very tricky to get good error messages here because there are so many
cases and the compiler cannot guess what the user really meant when NONE
of the cases apply. It should assume an overload resolution failure
if the ‘f’ is a function.

The complexity also means it is possible for a wrong interpretation to type check
and compile giving the wrong result, this is most common with plain old overloading
when the user defines things in global scope (JUST DONT!)

There is a TRADEOFF here between succinct notation and good diagnostics
and readability. Using “;” as a grammatic terminator in combination with
allowing “sin x” to be a function application is another example, if you leave
out the “;” by mistake you will often get an overload resolution error,
and the fact Felix does not have keywords makes this even worse.

By contrast this kind of error is rare in Ocaml, because there is no overloading,
and it has keywords which are often leading so that in fact “;” is not needed
as a terminator. However the errors in Ocaml can also be confusing because
of type inference.

All in all, its too much, but its hard to leave ANY of these features out.
For example

(proj 1 of (int * int)) (1,2)

is the full form of projection application, which is a lot more verbose than

(1,2).1




John Skaller
ska...@internode.on.net





John Skaller2

unread,
Oct 27, 2020, 6:44:01 AM10/27/20
to felix google
I hate this a LOT. What Python does is much better.

In Felix if you include a file X, and X includes Y, you get Y as well as X.

The original justification is that there’s no harm in including everything:
separation of APIs is done using classes.

Open directives, for example, are not transitive.

OTOH inherits are weird. Inherits are transitive, but they only impact
inherits and opens, that is, unqualified name lookup. Explicit qualification
ONLY finds symbols actually defined in the specified class.

This means for example:

class X { type A = .. type B .. }
class Y {
inherit X; typedef B = X::B;
}

var a : Y:: A = ERROR
var b : Y:: B = OK

because the typedef defines a new symbol in Y but the inherit does not.

This means there is no systematic and simple way to synthesise a class.

Note that “use” and “redefine” actually define new symbols!
But inherit simply allows chasing definitions when an unqualified
name is found to be inherited.

All this is deliberate .. which doesn’t make it right.




John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 3, 2020, 5:38:52 PM11/3/20
to felix-l...@googlegroups.com
https://www.cs.cornell.edu/courses/cs6110/2017sp/lectures/lec30.pdf

Very readable for cognition impaired programmers like me


John Skaller
ska...@internode.on.net





blu...@fi.muni.cz

unread,
Nov 3, 2020, 7:00:34 PM11/3/20
to felix-l...@googlegroups.com
John Skaller2 wrote:
> https://www.cs.cornell.edu/courses/cs6110/2017sp/lectures/lec30.pdf
>
> Very readable for cognition impaired programmers like me

There is also a great lecture series by Frank Pfenning at the OPLSS 2013 workshop.
The videos are available at:

https://www.cs.uoregon.edu/research/summerschool/summer13/curriculum.html

Achim

John Skaller2

unread,
Nov 3, 2020, 8:08:39 PM11/3/20
to felix google
Thanks, I’ll check that out.

I am trying to figure two things out.

1. The usual way is to embed shared variables inside linear logic.
Can this be done the other way? Felix starts with standard logic, sharing by default.
Then it adds a special type constructor Uniq T. Can this work? I actually use it
successfully to handle malloc’d C null terminated strings, but that doesn’t prove
it works in general.

One problem I found was that I made a couple of functions like “new” and
some list operations return uniquely typed results (like map for example).
The problem is I had to explicitly throw out the uniqueness in most code
that actually used these functions so I stopped returning the correct typing
for things because I had to modify the whole library and then find
the idea doesn’t really pan out.


2. How to do borrowing?

A unique variable can be passed to a function that forgets it
quite safely *provided* the variable is NOT killed (consumed).
This is a LOT more user friendly than forcing the function,
which only plays with the referred-to thing but doesn’t pass
it on or dispose of it, to return the value so the original
variable, which was killed calling the function, can be relivened.

At the moment I just CHEAT. The control flow analysis that validates
uniqueness safety cannot handle pointers except in special cases
so it ignores them. Thus the correctness of a program is NOT guaranteed
in all cases, but this is fine. I’m happy with it catching some errors, it’s better
than none, and the constraint (if you don’t use pointers to unique variables
safety is enforced) is reasonable. So I implement borrowing by just
passing a pointer to a function, which relies on the function actually
not breaking the rules (do not pass the pointer on or delete what it
points at).
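The convention, expressed as a C++ sketch (invented names): the caller keeps ownership,
and the callee gets a raw pointer it may use but must not store or free.

#include <memory>
#include <iostream>

static void borrow(int* p) { *p += 1; }   // plays with it, does not keep or delete it

int main() {
  std::unique_ptr<int> owned = std::make_unique<int>(41);
  borrow(owned.get());                    // borrow: ownership is unchanged
  std::cout << *owned << "\n";            // prints 42, owned is still valid
}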

I’d rather not cheat. Maybe I need another type constructor:

Affine T

or something?

There is MAJOR need for linearity in Felix. Felix is fundamentally based
on coroutines (a procedure is a coroutine that doesn’t actually do any channel I/O).
Transfering data along a channel is intrinsically a linear operation.
It would be good to enforce this with the type system.


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 10, 2020, 2:36:22 AM11/10/20
to felix google
It’s not always clear:

var ctl =
let nominal_period = 256.0 / 44100.0 * 0.01 in

Ctl (
clockStopped=false,
period = nominal_period,
nominal_period = nominal_period,
runflag = true,
circularBufferSpeaker = mkcb(),
circularBufferMicrophone= mkcb()
)
;

The mkcb function constructs a circular buffer. I was playing and added a diagnostic print.

Length = 131072, used=0, avail = 0
Length = 131072, used=0, avail = 0
Length = 131072, used=0, avail = 0
Length = 131072, used=0, avail = 0
Length = 131072, used=0, avail = 0
Length = 131072, used=0, avail = 0
Length = 131072, used=0, avail = 0
Length = 131072, used=0, avail = 0
Length = 131072, used=0, avail = 0
Length = 131072, used=0, avail = 0
Length = 131072, used=0, avail = 0
Length = 131072, used=0, avail = 0


Eh what?? 12 calls?

Well yes. It’s not optimal but correct: Ctl is initialised by setting

ctl.field = inrec.field

for each of 6 fields. So it is perfectly valid for inrec to be constructed 6 times,
a component selected, and the other 5 discarded. One would have preferred it to be
optimised so the discarded fields were not needlessly set but there’s no assurance
that the compiler will spot this optimisation. Bad in the above case because resources
(memory) were consumed and this one isn’t garbage collected so it leaked.
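The effect can be mimicked in C++ (a deliberately dumbed-down sketch): if every field
assignment re-evaluates the whole record expression and then selects one component, the
side-effecting constructor runs once per field instead of once in total.

#include <iostream>

static int mkcbCalls = 0;
struct CB { CB() { ++mkcbCalls; } };                 // stands in for mkcb()

struct InRec { double period; CB speaker; CB mic; };
static InRec makeInRec() { return InRec{0.01, CB{}, CB{}}; }

int main() {
  double period = makeInRec().period;   // unoptimised translation: build the
  CB speaker    = makeInRec().speaker;  // record once per field selected
  CB mic        = makeInRec().mic;
  (void)period; (void)speaker; (void)mic;
  std::cout << "mkcb calls = " << mkcbCalls << "\n";  // prints 6, not 2
}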

But far worse, if the compiler DID spot the optimisation, and a generator
you expected to run is not run, e.g.:

val x = 1, someGenerator();
println$ x.0; // x.1 never used

Here someGenerator() might never be called, and indeed for a val we want that to be permitted.
The fix AT THE MOMENT is

var x = 1, someGenerator();

because a variable must be eagerly initialised when control flows through it.
This may change, but at the moment Felix respects that rule. Probably it
shouldn’t; if the whole variable is not used, Felix mandates that the variable
and its initialiser are elided. It’s not clear it shouldn’t also make this
assurance for components of a product: the main obstacle is maintaining
the illusion that it has the full product type, even though some components
are never used.

The code to track this exists already in the uniqueness typing checker.
Components of a variable with a product type are called fairy variables.

So take care. If you use functional constructions make sure your code
is purely functional or you might get a big surprise!


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 18, 2020, 10:25:37 PM11/18/20
to felix google
I am planning to upgrade the RTL so a fibre control object (fthread_t) contains
a pointer to the scheduler model (data) object used by the scheduler that spawned it.

Read and write will change so when a matching I/O operation is serviced,
the operation moves the waiting fibre from the schannel and puts it back
on the scheduler model from whence it came.

For this to work the pointer must not be invalidated. I need to prove
this is the case (I believe it must be so already but if not it must be
enforced by further modifications).

The effect of this change is profound. It means fibres cannot migrate.
Previously, the waiting fibre migrated to the scheduler of the running fibre.

IN PARTICULAR, even though spinlocks ensure svc synch operations
are atomic, a fibre migrating from one PTHREAD to another is not
comprehensible behaviour. With this change, a major new facility is
available: schannel communications across pthreads will now
behave in the expected manner. In particular, pchannels will no
longer be needed.

The next thing to look at is async events. Now that the correct
scheduler is known, the interface can be simplified a bit.

However, to use spinlocks, the current mechanism of rooting
a fibre waiting for an asynchronous event, so it remains reachable,
will no longer suffice. This is because the current rooting machinery
uses an STL map from the address to a counter, and accesses
are protected by a mutex lock. Unfortunately an STL map can allocate,
which is also bad (because malloc() might need to call the OS).
Also unfortunately, a singly linked list will not work in O(1) for random
add/delete orderings.


It must be noted that Felix bound queues use special locks
and condition variables, not standard ones. This is because
a locked up pthread cannot respond to a GC stop world
request in a timely manner. Instead all operations must
use timed waits and implement a polling loop which checks
periodically for a world stop.
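To illustrate the idea only (a plain C++ sketch, not the actual Felix RTL code), a timed wait with a periodic check for a world stop looks something like this:

//////////////////////////////////////////
// Sketch only: a worker blocks on a condition variable, but never for
// longer than a bounded interval, so it can notice a GC world-stop
// request promptly even if nobody signals it.
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>

std::mutex m;
std::condition_variable cv;
std::queue<int> jobs;                  // stand-in for a Felix bound queue
std::atomic<bool> world_stop{false};   // set by the collector

void worker() {
  std::unique_lock<std::mutex> lock(m);
  for (;;) {
    // wait at most 10ms, then re-check even if nobody signalled us
    cv.wait_for(lock, std::chrono::milliseconds(10),
      []{ return !jobs.empty() || world_stop.load(); });
    if (world_stop.load()) {
      // cooperate with the collector here, then carry on
    }
    if (!jobs.empty()) {
      int job = jobs.front(); jobs.pop();
      (void)job;                       // process the job
    }
  }
}
//////////////////////////////////////////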

For external event loops there is another mechanism: when a
pthread yields to an event loop it removes itself from the
registry of active threads. This works because the machine
stack is empty of Felix data at this point, and because this
thread can no longer change data structures, it is effectively
also yielded to the GC.





John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 20, 2020, 9:22:01 AM11/20/20
to felix google
Ok it’s implemented and in test. There's a known deadlock possible,
due to using a nested lock guard.

Now fibres cannot migrate, BUT any fibre can read/write to any other.
If you write to a disconnected channel, then run a new scheduler
with a fibre that reads that channel, the writer is released but can’t
run (because another fibre has called run, creating a nested scheduler).
Could still be useful.

Apart from the deadlock to be fixed, schannels should now work
across pthreads as well.

There’s a caveat with cross-scheduler I/O: if a fibre writes and releases
a starving reader on another scheduler, the writer continues. Previously
the reader always went first, so it had a chance to copy the pointed-at item.

I am not worried because I believe write/read should be linear, that is,
the pointer written must be the exclusive (unique) owner of the pointed
at object, the reader then acquires exclusive ownership, and so shouldn’t
need to copy it.

Of course writing a shared object is also allowed: what is NOT allowed is writing
a pointer to an object which could be changed under the reader unexpectedly.

=========

This leads to an interesting idea: within a scheduler list which is shared
by pthreads, so that coroutines can all be concurrent, we can create a nested scheduler
where all the coroutines are sequential. It has always been a vexing problem that, in a set
of concurrent routines, some need to be sequential. The idea is to preserve the fact that
sequential coroutines do not need locks, while concurrent ones do .. will think more on this.



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 23, 2020, 5:13:02 AM11/23/20
to felix-l...@googlegroups.com
So changes: schannel I/O and other sync operations use spinlocks as required.

Previously locks were only used if the active list was shared by two pthreads.
Now the locks are always used even if there is only one pthread.
The reason is that separately spawned pthreads can now communicate
with schannels: these pthreads have their own schedulers and active lists.
There’s no easy way to tell if there is only one pthread running.
This will slow down performance for singly threaded applications,
but allow greater flexibility and speed up applications that do use
multiple threads.

The deadlock mentioned previously has been removed.



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 30, 2020, 9:13:49 PM11/30/20
to felix google
Felix now supports Vasiliy Tereshkov’s statically typed embeddable microinterpreter Umka:

https://github.com/vtereshkov/umka-lang

This should run:

//////////////////////////////////////////
include "std/umka";

open Umka;

var amt = 1024 * 1024;
var umka = umkaAlloc();
var hellostr =
c"""fn main() {
printf("Hello Umka!\\n")
}
""";


var ok = umkaInit (umka, c"Dummy", hellostr, amt, amt,0,C_hack::cast[+(+char)]0);
println$ "init ok = " + ok.str;
ok = umkaCompile umka;
println$ "compile ok = " + ok.str;
ok = umkaRun umka;
println$ "run ok = " + ok.str;
umkaFree umka;
//////////////////////////////////////////

The idea is to create a C API, make an Umka binding to it, and then
allow client side “eval” like behaviour as shown in the example above.

One use would be to program the construction of dynamic metatypes.
At the moment this would have to be done in C, then compiled, linked
into a shared library, and loaded, all at run time without error.




John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 7, 2020, 10:52:16 PM12/7/20
to felix google
At present the system structure is like this:

Thread -> scheduler <-> fibres -> coroutine -> global data

I recently added the weak pointer from fibres to scheduler.

If we want a per thread allocator, we’re missing the way for the
coroutine to discover it.

Extra pointers cost. Although schedulers are few, two threads can
share a scheduler, so any allocator pointer would have to be in the
scheduler (so the policy would be per scheduler not per thread).

As a complication, schedulers can nest: a coroutine can create a new
scheduler with the “run” procedure (which is a subroutine!). Coroutines
already contain several pointers:

* program counter
* svc pointer
* chain link
* return pointer

as well as, usually, a pointer to the display, including the global data (thread_frame_t).
These are the most common objects at the system level. ;(

The GC is currently part of the global data (thread frame).




John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 10, 2020, 11:56:35 PM12/10/20
to felix google
So, our problem is to create a real time thread handling circuits.
To do this, we may use a non-Felix thread so it doesn’t have to stop on GC.

For this to work correctly, either

(1) none of the components are GC allocated,
and don’t contain strong pointers to GC objects,

OR

(2) All the objects are rooted. The reason for this condition is that
the paths between fibres, coroutines, and channels will be dynamically
changing whilst the GC is running, because the RT thread does not stop.

Although I think (1) is the best solution it is harder to implement and
subject to programmer errors; most chips are created by forming closures
over the pins, so reachability matters a lot.

The requirement for RT signal processing is that all the coroutines run
forever and do not spawn any objects on the GC heap; this is the case
for most signal processing systems which, in order to be real time in
the first place must consist of connected finite state machines only.

So my thought for (2) is to make a modification of create circuit command
which somehow returns a frame object containing pointers to all the
fibres and channels created.

All the fibres will be

(a) pointed at by the scheduler
(b) pointed at by schannels
(c) held by async I/O operations (typically an alarm clock)

The set of fibres is also fixed. However, the fibres held in the scheduler’s
run pointer and active list change with channel I/O, which is protected
by a spinlock invisible to the GC.

So we really need to root the whole set. My thought is to use a procedure
object to spawn the fibres, magically get the fibre pointers into that object
as well, and store the data frame of the procedure in a reachable place
nominated by the user; this ensures all the objects created are always
reachable, even if the paths between objects are changing whilst the
GC is running. Hopefully the GC algo will still terminate .. :-)

When the client wants to kill the signal processor, they will set a flag which
will stop some of the loops in components and cause the processing
framework to collapse, so the scheduler will finally return. However, now
the client must NULL out the pointer to the object holding everything,
so that it is no longer reachable and can be collected.

I had a look at the compiler and it spawns the fibres in a sequence in the
current scheduler. It somehow needs to use “spawn_and_return_fibre”
to get the fibre pointers and save them. The channels seem to be created
in the current space, which seems wrong (they should be created in a procedure
which binds everything then returns, so the channels are forgotten, otherwise
the circuit won’t terminate correctly). Any which way, it should be possible
to collect the channels into a frame.

So basically PART of the issue can be solved by adding an enhancement
which allows collecting the created fibres and channels into an object, a pointer
to which is returned and saved; then, to allow the GC to resume tracing when the
thread is stopped, this pointer is just forgotten (perhaps by NULLing it out or
just letting it go out of scope .. )

This will allow circuits to run in places the GC cannot see directly. This
includes foreign RT threads, but it also means they can be run by foreign
C++ code as well. Specifically, since “run” is just a subroutine, a foreign
API calling it can be passed a Felix created scheduler, so foreign
C++ can just call it like any other subroutine.

None of this solves the problem of creating and managing the RT aspects
of the RT pthread; but it does allow that thread to run continuously
even when the GC is running, without need for any world stop.



John Skaller
ska...@internode.on.net





Garrett Bluma

unread,
Aug 11, 2021, 3:56:29 PM8/11/21
to felix-l...@googlegroups.com
Remind me again (memory is a bit fuzzy on it), the felix channel logic doesn't use the stack, so stack swapping isn't an issue either, right? If we looked at the assembly we wouldn't see push/pop and ret calls for the stack pointer (at the chip level), it would just look like a loop with short jumps between the chip segments, right?


John Skaller2

unread,
Aug 12, 2021, 1:03:57 AM8/12/21
to felix-l...@googlegroups.com


> On 12 Aug 2021, at 05:56, Garrett Bluma <garret...@gmail.com> wrote:
>
> Remind me again (memory is a bit fuzzy on it), the felix channel logic doesn't use the stack, so stack swapping isn't an issue either, right? If we looked at the assembly we wouldn't see push/pop and ret calls for the stack pointer (at the chip level), it would just look like a loop with short jumps between the chip segments, right?

if it were optimised. But in the dynamic form a service call works by:

a) put a pointer to the request in the continuation object in a reserved place
b) return this (pointer to current object)

Then, the scheduler, after the service is done, will resume it by something like:

while(ptr) ptr = ptr->resume();

This is why the machine stack must be empty when doing a service call, more precisely,
saying “return this” will indeed empty it back to the stack the scheduler had. So all the
“local variables” have to be non-static members if they’re to be preserved.
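A stripped down sketch in C++ (not the nurtl source, just the shape of the mechanism): the continuation keeps its “locals” as members, a service call parks a request and returns this, and the scheduler drives everything with the resume loop:

//////////////////////////////////////////
#include <cstdio>

struct continuation {
  int svc = 0;       // nonzero: a service request is pending
  int pc = 0;        // where to resume
  int counter = 0;   // a “local variable” kept as a member, not on the stack
  virtual continuation *resume() = 0;
  virtual ~continuation() {}
};

struct demo : continuation {
  continuation *resume() override {
    switch (pc) {
    case 0:
      counter = 42;
      pc = 1;
      svc = 1;           // issue a service call ..
      return this;       // .. by returning to the scheduler
    case 1:              // resumed after the service is done
      std::printf("counter survived: %d\n", counter);
      return nullptr;    // finished
    }
    return nullptr;
  }
};

int main() {
  demo d;
  continuation *ptr = &d;
  while (ptr) {
    if (ptr->svc) { /* the scheduler services the request here */ ptr->svc = 0; }
    ptr = ptr->resume();
  }
}
//////////////////////////////////////////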

Felix has no idea what a coroutine is. It knows only service calls, they’re
a virtual machine instruction.

If the compiler (which the nurtl does not use yet) were taught about coroutines,
you’re right, one could just jump from one section of one coroutine to another.
It all depends if the place to jump is statically known, in a variable, or whatever.
With the scheduler, it’s dynamic.

It should still be pretty fast.

The new allocators are much faster allocating than the Felix GC, and there’s
no world stop, but there are design issues avoid cycles.

The channel I/O is slower because there are two levels of reference counting
involved to get the termination semantics correct.

So the “real time” system may not be as fast overall but latency should be bounded.


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Aug 12, 2021, 1:38:12 AM8/12/21
to felix google
Some tweaks.

I noticed the “chips” don’t work at all like other continuations. They’re infinite loops so do
not need a _caller variable, which has to be artificially set to nullptr. And they’re
spawned, not called.

So I think callable objects like *subroutines* which can return are distinct
from *coroutines* which do not.

Of course we’re using continuation passing in both cases, but differently.

In principle a *routine* has arguments, but if that includes a continuation
it should be passed explicitly as an argument. A subroutine is then a special
case where the passing of the “caller” current continuation is more “implicit”.

At the moment the caller has to be passed explicitly.



John Skaller
ska...@internode.on.net





Garrett Bluma

unread,
Aug 12, 2021, 11:53:14 AM8/12/21
to felix-l...@googlegroups.com
Ahh. Forgot about fibre scheduler and coroutine pointers. Thanks.


John Skaller2

unread,
Aug 13, 2021, 6:58:51 PM8/13/21
to felix google

> So I think callable objects like *subroutines* which can return are distinct
> from *coroutines* which do not.

However I can’t find a good way to encode this.

I tried a get_caller() method which works. However, when a fibre is deleted its stack
of subroutines is unwound. I can avoid storing a nullptr in a coroutine, but then
a virtual dispatch is needed to fetch it: coroutines return nullptr.

On return, a subroutine is normally deleted. In fact if a coroutine returns
it should be deleted too.

HOWEVER a subroutine returning becomes inactive and shouldn’t run
again, BUT its local variables could be required by a conceptually
nested subroutine which the parent subroutine is returning, as
in the ML/Felix code like:

fun f (x:int) (y:int) => x + y;

which means

fun f(x:int) => fun (y:int) => x + y;

i.e. f is returning a nested lambda which binds
to the variable x. So we’d better NOT delete it in this
case. The program counter and return address SHOULD be lost,
however in my model a single object holds that information AS WELL
AS LOCAL VARIABLES. This is just for optimisation.

In theory the control values should be separated from the local variables
so the latter can persist when the former go out of scope (but that requires
two heap allocations instead of one).


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Aug 14, 2021, 3:48:26 AM8/14/21
to felix google
Ok so there are some simple changes.

There are now two classes:

coroutine_t
subroutine_t

Coroutines optionally have a setup method you write to store initial values in variables;
subroutines use a call method instead.

The subroutine’s call method must set the caller variable. Coroutines do not have a caller,
they must be spawned.

The current kind of subroutine IS AUTOMATICALLY DELETED when it returns
as are coroutines. However, if the subroutine is a function returning an “in principle" nested
subroutine this must not be done. There’s another class

curry_subroutine

for this which returns control without deleting itself so that the nested routine
can refer to it: in this case the nested routine has to delete it when it is itself
deleted perhaps using a unique_ptr<> type of thingo. This is yet to be
implemented.

Subroutines passed as arguments are not a problem since the parent
lifetime exceeds their scope; in this case the parent must delete it.
However the client that called and passed it must NOT delete it,
AND must not pass it on to a routine that hangs onto it too long
(e.g. you cannot add it to a data structure that lives longer than the
original parent).

The rules are obvious but hard to state precisely.
[And getting the Felix type system to manage it is the real goal .. so I need to
establish them precisely and, worse, figure out a way for the type system
to enforce them].

For now, as usual, the C++ programmer has to be disciplined and will
blow their head off if not careful.



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Oct 14, 2021, 12:06:26 AM10/14/21
to felix google
In the new run time, I am struggling to figure out how async channels work.

The basic idea is simple enough. Consider an alarm clock. A fibre wants to
sleep until a certain time so it writes to an async channel connected to
the clock. The clock is managed by a service thread, and will probably
be asleep when the client writes to the channel, so the clock must
either poll regularly to see if there are any requests to service,
or sleep in a way that it can be kicked out of its sleep.

The obvious solution is to use a condition variable with a timed
wait. This means the clock will poll for requests with a certain
latency but it can be woken up with an optional signal to reduce
the latency.

Now when the timeout of the alarm has been reached, the clock
must push the waiting fibre back on the process active list.
This means it has to have access to the list. It gains this from
the client fibre, since all fibres contain a pointer to their owning
process. In addition, one or more threads of the process may be
asleep, again using the process sleep condition variable,
so the clock should signal that to reduce the delay of a polling loop.

To ensure correct termination, we also must increment a counter
of pending async operations when we begin an async request,
and decrement it when the request is satisfied and the fibre
made active again.
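Before going further, here is a rough sketch of that shape in plain C++ (not the Felix RTL; Fibre and Scheduler are stand-ins and the pending-operation counter is omitted). The clock thread sleeps on its own condition variable with a bounded wait, holds requests in a priority queue, and hands each fibre back to its owning scheduler when the deadline passes:

//////////////////////////////////////////
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct Fibre { int id; };

struct Scheduler {
  std::mutex m;
  std::vector<Fibre*> active;          // the process active list
  std::condition_variable wake;        // signalled so a sleeping thread polls sooner
  void make_active(Fibre *f) {
    { std::lock_guard<std::mutex> g(m); active.push_back(f); }
    wake.notify_one();
  }
};

struct Request { std::chrono::steady_clock::time_point when; Fibre *fibre; Scheduler *owner; };
struct Later {
  bool operator()(const Request &a, const Request &b) const { return a.when > b.when; }
};

std::mutex cm;
std::condition_variable ccv;           // the clock's own wakeup signal
std::priority_queue<Request, std::vector<Request>, Later> pending;
std::atomic<bool> finished{false};

void alarm_clock() {
  std::unique_lock<std::mutex> lock(cm);
  while (!finished) {
    auto deadline = pending.empty()
      ? std::chrono::steady_clock::now() + std::chrono::milliseconds(50)
      : pending.top().when;
    ccv.wait_until(lock, deadline);    // a new request can signal us earlier
    auto now = std::chrono::steady_clock::now();
    while (!pending.empty() && pending.top().when <= now) {
      Request r = pending.top(); pending.pop();
      r.owner->make_active(r.fibre);   // reschedule on the owning process
      std::printf("woke fibre %d\n", r.fibre->id);
    }
  }
}

// what the channel write would do on behalf of the client
void request_sleep(Fibre *f, Scheduler *s, std::chrono::milliseconds d) {
  { std::lock_guard<std::mutex> g(cm);
    pending.push(Request{std::chrono::steady_clock::now() + d, f, s}); }
  ccv.notify_one();                    // kick the clock out of its sleep
}

int main() {
  Scheduler sched; Fibre f{1};
  std::thread clock(alarm_clock);
  request_sleep(&f, &sched, std::chrono::milliseconds(10));
  std::this_thread::sleep_for(std::chrono::milliseconds(100));
  finished = true; ccv.notify_one();
  clock.join();
}
//////////////////////////////////////////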

This is how the existing clock code works. The immediate problem
is this: how can the client wake the clock up? It cannot be done
before the request, because that would be a race,
and it cannot be done after the request, because that would be too late:
by then the request has already been serviced, if that
ever actually occurs. Therefore, it must be done BY the channel
operation.

In turn this implies that the channel knows the condition variable
the clock is using to wait for requests, so it can be signalled.
So the first issue could be expressed as:

1. HOW TO TELL THE CHANNEL WHERE THE DEVICE
END CONDITION VARIABLE IS?

However, this turns out to be a problem when we now consider that
channels can be bidirectional, and can support any number of readers
and writers. There could be two clocks, for example, if the load on clocks
was high, to relieve the load, with either clock being able to service
a request.

In this case there will be TWO condition variables to signal, one for
each clock. A flexible technical solution to this is obvious: provide
a single object with a signal method to call, where the method call
can do something arbitrary such as send two signals.

We must also consider: what happens if two clients write
sleep requests to the clock? In fact, this one works out of the box.
What the clock does is remove a request from the channel when
it schedules the sleep. So it holds the fibre in an internal priority queue.
When the time comes to wake the client fibre, it knows which process
to signal because the fibre knows which process it is part of.
Indeed, the clock has to put the fibre back on that process’s active
list. The point is that the channel itself DOES NOT HOLD THE FIBRE
after the clock schedules the wakeup on its priority queue: it is stored
in that queue. So we don’t have to go searching the channel’s
suspension list.

But this still fails to be general. It also raises the question: how
does a signaller object get created and installed? It’s obvious that
only the clock itself can do this. BUT HOW IS IT PROVIDED TO THE
CHANNEL? Remember, the signalling MUST be done by the
I/O operation, it cannot be done by the client separately.

Now, we also have to look at the reverse situation; more specifically
we shall consider a ticker clock. Whereas with an alarm clock
the client writes a wake up time as a request then sleeps until
woken, a ticker wakes up multiple waiting fibres at the same time,
repeatedly. For example any real time operation such as video
sync or say stepping through actions in a video game.

In this case, the direction of the I/O operation is irrelevant but
we will consider it as a read by the client which returns the actual
time stamp of the tick.

Now the ticker will work entirely differently to an alarm clock.
It goes to sleep and wakes up at a fixed time. At this point,
it examines the request channel, and removes ALL the clients
from the channel, rescheduling each one to their respective
process. It can in fact signal each one also. Note that this may
involve repeated signalling of the same process. It’s not clear
this can or should be optimised since the ticker is a standard
pre-emptable operating system thread.

The interesting thing in this case is that the ticker can IGNORE
signals from the clients. This is a second argument for a signalling
object.


So now we have a model in which the channel/endpoints have
a single device signalling object that can handle multiple devices
if they somehow subscribe to the signalling object.

We will now note that the process signalling side could be handled
by a signalling object.
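A tiny sketch of what such a signaller interface could look like (hypothetical C++, just to pin the idea down): the channel holds one abstract signaller, and what a signal actually does is entirely up to whoever installed it:

//////////////////////////////////////////
#include <condition_variable>

struct signaller_t {
  virtual void signal() = 0;
  virtual ~signaller_t() {}
};

// wake one device, e.g. a single alarm clock
struct cv_signaller : signaller_t {
  std::condition_variable &cv;
  explicit cv_signaller(std::condition_variable &c) : cv(c) {}
  void signal() override { cv.notify_one(); }
};

// wake two devices sharing the load, e.g. two clocks
struct pair_signaller : signaller_t {
  signaller_t &a, &b;
  pair_signaller(signaller_t &x, signaller_t &y) : a(x), b(y) {}
  void signal() override { a.signal(); b.signal(); }
};

// a ticker that runs on its own schedule can simply ignore signals
struct null_signaller : signaller_t {
  void signal() override {}
};

int main() {
  std::condition_variable clock_cv;
  cv_signaller one(clock_cv);
  null_signaller none;
  pair_signaller both(one, none);
  both.signal();   // the I/O operation would call this, not the client
}
//////////////////////////////////////////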

But now we have not decided where the signalling object lives:
in the channel or endpoint. Currently endpoints must contain the allocator
used for the channel, which depends on how the channel was created.
The endpoints themselves must be created with the same allocator.

Worse, consider a channel which is used to send messages according
to a complex protocol. A simple example is a client server where
a request is followed by a response. The client does a write with the request,
and then does a read on the same channel to get the response whilst the
server does a read to get the request and then writes the response.

In this case, multiple connections to the channel will not work because
one client may fetch another client’s write request with a read, expecting
a response. Nevertheless this is an architectural design issue: channels
of any kind have limitations.

Another issue is that even concurrent channels acting between
fibres in the same process may need to send a signal to wake
up an idle thread. Technically communication between fibres
of the same process can be asynchronous, if the process is
multi-threaded. The effect of NOT signalling is that a thread will
remain asleep and only wake up at the polling interval to fetch
a waiting job.




John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 7, 2021, 9:39:58 PM11/7/21
to felix-l...@googlegroups.com, raf
I have upgraded the uniqueness checker to handle records as well as tuples.

I am also doing a new routine that checks that a variable is assigned before
use. There are two approaches:

1. Aggro: report error if there is ANY control path on which the variable
is read when dead (not assigned yet). Will give some false positives.

2. Conservative: report only if ALL control paths leading to a read
of the variable do not have assignments. Will miss some cases.

In addition, at least in this routine, taking the address of a variable
is ignored unless it is precisely of the form

storeat(&x, v)

which is an assignment (since, ultimately, this is the ONLY way
to do an assignment!)

Note that read/modify/write on variables is an open question here.
In a single thread that’s just a load, computation of a new value,
and a store (but in multi-core applications we have to deal with
atomic read/modify/write like CAS).
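Just to pin down the distinction (a C++ aside, nothing Felix specific): the single threaded form is a plain load, compute, store, while the multi-core form needs a CAS retry loop so a concurrent store can’t be lost:

//////////////////////////////////////////
#include <atomic>

int plain_counter = 0;
std::atomic<int> shared_counter{0};

// single threaded read/modify/write: load, compute, store
void bump_plain() { int v = plain_counter; plain_counter = v + 1; }

// multi-core read/modify/write: retry until nobody raced us
void bump_shared() {
  int expected = shared_counter.load();
  while (!shared_counter.compare_exchange_weak(expected, expected + 1)) {
    // expected has been refreshed with the current value; just retry
  }
}

int main() {
  bump_plain();
  bump_shared();
  return (plain_counter == 1 && shared_counter.load() == 1) ? 0 : 1;
}
//////////////////////////////////////////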

[RAF, you should get this email twice, once from me directly
and once from the google email list]


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 9, 2021, 1:29:39 AM11/9/21
to felix google, raf
Actually, it turns out I think that the control flow liveness analysis can be parametrised
on the required logic, so upgrades to the routine work for both uniqueness analysis
as well as use before initialisation checks.

And looking at the code, it is also possible to bug out on “unused variables”
or at least warn. I wonder if that’s a good thing.

I will note this interesting issue that impacts the analysis:

The analyses are done TWICE. Once on polymorphic code,
and once after monomorphisation.

The problem is if you have an unused unique variable, Felix will
optimise the bug away. So we need to do the analysis first,
before we do that.

This is a general problem with Felix. A lot of type checking etc
is ONLY done on used code. Functions that are in your library
but not used will not be fully checked. This is a really bad feature.




John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 9, 2021, 6:10:20 AM11/9/21
to felix google, raf


> On 9 Nov 2021, at 17:29, John Skaller2 <ska...@internode.on.net> wrote:
>
> Actually, it turns out I think that the control flow liveness analysis can be paramtrised
> on the required logic, so upgrades to the routine work for both uniqueness analysis
> as well as use before initialisation checks.

Woops! No! The algorithm is actually different! It is not in fact linear!

Here’s the problem. For uniqueness, paths leading into a label are required to be consistent.
The reason is, along any path, if a variable is live it must eventually be killed, whereas
if it is dead it may be assigned to revivify it, but until that point it cannot be killed since
it is already dead. These two actions are mutually exclusive. It may be that the programmer
keeps an independent record of the state, allowing a valid action, but the analysis
requires that the variable be killed on all paths: static analysis would be impossible otherwise
and the constraint is reasonable.

However this is NOT the case for the requirement that a shared (ordinary old) variable
be initialised before use. Consider:

stuff
if(c)goto lab1;
v = 1; // initialise v
lab1:
v = 2;
use(v); //????

Here, v is live at lab1 on the drop-through of the conditional, but dead if the branch is taken.
However, it is live before use in either case because of the v=2 assignment.

So let’s try the rule: if the variable is live on one path and dead on another,
at the join it has to be treated as if it were dead. The problem is now:

stuff
if(c)goto lab1;
v = 1; // initialise v
lab1:
use(v); //????

If we drop through first, the use of v is passed. TICK.
But now, in the branch case, it is dead. WOOPS!

So, we have to pass through the code AGAIN, this time assuming the
variable is dead, in order to fail the use(v). FAIL!

We have to re-analyse the code if the variable was thought live,
and is subsequently found to be dead on another path. BUT ..

There could be any number of jumps to a label, and each one could
have one variable dead, so we have to pass through the subsequent
code again, up to once per jump to the label. (total, once per entry
to the label including drop throughs).

We only have to do this if a variable thought live previously turns out
to be dead on another path.

The question now is how to terminate the recursion, and what is
the complexity of the algorithm now??


So here is the rule: for all the variables calculate

new_state = current_path_state & label_state

Now, terminate the recursion if new_state = label_state,
otherwise set the label_state to the new_state and continue.

It is clear that an upper bound on the number of times we continue
is the number of variables. The point is, logical and can only
make a live state dead. Once dead, the variable cannot be
revivified. In addition, the number of jumps (plus one drop through case)
is also an upper bound.

The complexity is thus quadratic at worst, but will in most cases
be effectively linear: it’s

O(N * M)

where N is the number of instructions, and M is the number of variables.
This is because we only redo a path if a variable is killed, a variable
cannot come live again, and at worst we kill one variable at a time
until they’re all dead.

OK. SOLVED. ;-)
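A toy version of the merge rule (C++, with liveness held as one bit per variable; purely illustrative, the real checker lives inside the Felix compiler):

//////////////////////////////////////////
#include <bitset>
#include <cstdio>

constexpr int NVARS = 4;
using State = std::bitset<NVARS>;      // bit set = variable is initialised

// Merge an incoming path state into the label state.  Returns true if
// the code after the label has to be walked again.
bool merge_at_label(State &label_state, const State &incoming) {
  State new_state = label_state & incoming;   // dead on any path => dead
  if (new_state == label_state) return false; // fixpoint reached: stop
  label_state = new_state;                    // state shrank: re-walk
  return true;
}

int main() {
  State at_label("1111");      // first arrival: everything initialised
  State other_path("1011");    // another path on which variable 2 is dead
  bool again = merge_at_label(at_label, other_path);
  std::printf("re-walk=%d state=%s\n", again, at_label.to_string().c_str());
  // a second arrival with the same state hits the fixpoint immediately
  again = merge_at_label(at_label, other_path);
  std::printf("re-walk=%d state=%s\n", again, at_label.to_string().c_str());
}
//////////////////////////////////////////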




John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 10, 2021, 8:11:16 PM11/10/21
to felix-l...@googlegroups.com
Just a brief summary of how the flow analysis works,
for both the uniqueness and init-before-use (and, later, unused-variable) analyses.

The routines work by simulating execution.
We have an object called a continuation, which has a list of instructions
yet to be executed. It also maintains a set of live variables.

To process an instruction, we first examine the expressions in the instruction.
Since these are all, logically, just function applications, we are mainly interested
in the variables passed as arguments to the functions. Passing a variable
as an argument constitutes a use of the variable, so we chuck an error
if any variable passed as an argument is dead at this time.

Felix does not allow functions to modify non-local variables.
Therefore we don’t have to worry that calling a function will change
a variable from dead to live state. in the uniquness analysis,
However, reading a uniquely typed variable does change the variable
state from live to dead! So we have to do that in that routine.

Now there is one further complication. If the function we are calling is
a CHILD it can access a variable of the caller, even if it is not
passed as an argument. So we have to handle that.

Note: we DO NOT handle accessing variables via pointers, EXCEPT
in the special case of assignment, which is a call to the _storeat primitive,
and then only if the address taking is immediate, like so:

_storeat (&var, val)

which is an assignment of val to var.

Now before I go on, there’s another complication, and, I believe,
academic articles on uniqueness checking GET THIS WRONG.
If a variable has product type, we ADD a set of variables,
one for each component. I call these “fairy variables”. For example if

var x = (a,b);

there are THREE variables here: x, x.0, and x.1. Since products
can contain products, each variable is represented by

(a) the variable index in the symbol table
(b) a path of indices to a component

This means variables can be COUPLED. In other words,
if you assign to x, you are also assigning to x.0 and x.1.
However, you can also assign to just x.1 and not x.0.

To handle this we maintain liveness states of the fairy variables
ONLY. Then, we break down reads or writes to parents into a set
of individual reads or writes.

For example, if we do:

var x = box (1,2);

then x is uniquely typed, and so, we consider the x.0 and x.1
only, and consider them *separately* as uniquely typed.
If we try to do a read on x, we’re reading BOTH and so both
must be live. But we also allow reading just ONE component,
which has to be live, but the other component could be dead
without triggering an error.
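A little sketch of the representation (C++ rather than the compiler’s OCaml, and nothing here is the real code): a fairy variable is a symbol index plus a path of component indices, and a write to a prefix of a path marks everything beneath it live:

//////////////////////////////////////////
#include <algorithm>
#include <cstdio>
#include <map>
#include <vector>

using Path = std::vector<int>;
struct Fairy { int var; Path path; };

bool operator<(const Fairy &a, const Fairy &b) {
  return a.var != b.var ? a.var < b.var : a.path < b.path;
}

std::map<Fairy, bool> live;   // liveness state of every fairy variable

bool is_prefix(const Path &p, const Path &q) {
  return p.size() <= q.size() && std::equal(p.begin(), p.end(), q.begin());
}

// assigning to (var, path) also assigns to every component beneath it
void assign(int var, const Path &path) {
  for (auto &f : live)
    if (f.first.var == var && is_prefix(path, f.first.path))
      f.second = true;
}

int main() {
  // var x = (a,b):  two fairies x.0 and x.1, both dead initially
  live[Fairy{0, {0}}] = false;
  live[Fairy{0, {1}}] = false;
  assign(0, {1});       // assign to x.1 only
  std::printf("x.0=%d x.1=%d\n", (int)live[Fairy{0, {0}}], (int)live[Fairy{0, {1}}]);
  assign(0, {});        // assign to x itself: both components become live
  std::printf("x.0=%d x.1=%d\n", (int)live[Fairy{0, {0}}], (int)live[Fairy{0, {1}}]);
}
//////////////////////////////////////////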

Now the INTERESTING bit. In the first instance our continuation
might hit a subroutine call. If it is a call to a non-child we can
ignore it (after processing the arguments!)

But if it is a child, it can READ a variable of interest.
To handle this we pass around a STACK of continuations.
Which is exactly what a real machine does. So our simulation
is very similar to real execution in this sense.

When we hit a return statement in a procedure, we’re
basically doing the same as a real return: we pop the
stack, and continue processing the new top of stack,
which is the caller’s continuation. Which, remember,
is just a list of instructions to be executed.

Similarly, a goto instruction is processed by simply changing
the list of instructions of the current continuation to the list
following the label. We cheat, by also adding two dummy labels,
one for the first instruction of a procedure, and one for the last.
So on entry to a procedure we create a new continuation,
and just “goto” the entry point after adjusting the current
continuation to bypass the call instruction when we return,
and a return is like a goto, followed by popping the current
continuation.

The interesting case is a CHEAT. If we have a conditional
goto, we have TWO paths to follow. So, we pick one of
them to follow now, and PUSH the continuation of the other
one onto the stack to do later. When the current continuation
finally returns, it pops back to the path we didn’t explore yet.
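A toy driver showing just that shape (again C++ for illustration only, not the actual OCaml in the compiler): a continuation is a position in the instruction list plus a set of initialised variables, and a conditional goto pushes the untaken path to be explored later:

//////////////////////////////////////////
#include <cstdio>
#include <set>
#include <stack>
#include <vector>

struct Instr { enum { USE, ASSIGN, COND_GOTO, HALT } op; int var; int target; };

struct Continuation {
  const std::vector<Instr> *code;
  size_t pc;
  std::set<int> live;              // variables known initialised on this path
};

void simulate(const std::vector<Instr> &code) {
  std::stack<Continuation> todo;
  todo.push({&code, 0, {}});
  while (!todo.empty()) {
    Continuation c = todo.top(); todo.pop();
    for (;;) {
      const Instr &i = (*c.code)[c.pc++];
      if (i.op == Instr::HALT) break;
      if (i.op == Instr::ASSIGN) c.live.insert(i.var);
      else if (i.op == Instr::USE) {
        if (!c.live.count(i.var))
          std::printf("use of possibly uninitialised variable %d\n", i.var);
      } else {  // COND_GOTO: explore the branch later, fall through now
        todo.push({c.code, (size_t)i.target, c.live});
      }
    }
  }
}

int main() {
  // if (c) goto 2;  v0 = ...;  label2: use v0
  std::vector<Instr> code = {
    {Instr::COND_GOTO, 0, 2},
    {Instr::ASSIGN, 0, 0},
    {Instr::USE, 0, 0},
    {Instr::HALT, 0, 0},
  };
  simulate(code);   // reports the use on the branch path, not the fall-through
}
//////////////////////////////////////////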

Now, the interesting stuff is PATH MERGING. When
you do a goto, you might have already processed the code
at that label. To know this, we keep a list of visited labels.
It is a mutable field, and, when we bifurcate on a conditional
goto, the list is SHARED.

So, in the uniqueness typing, when we hit a label a second
time, we require the current liveness state to be the same
as the first time, otherwise we BUG OUT.

In the use before initialisation check, it is harder!
In that case we logically AND the variable liveness states
on first and second entry, where liveness is true if live or
false if dead. Actually the encoding ONLY tracks which
variables are live, as a set, so the logical AND is a set intersection!

If the new state is then different to the old state, we have to
CONTINUE PROCESSING THE INSTRUCTIONS A SECOND TIME.
Or a third or fourth time!

Since the number of variables involved is fixed, and with use before
initialisation checking we’re ONLY interested in whether we use a dead
variable, and the logical AND operation can ONLY kill variables,
the number of times we do this is bounded by the number
of variables, and we MUST reach a fixpoint. Therefore the
analysis MUST terminate.

I think it is interesting that what would appear to be more difficult,
the uniqueness analysis, because the liveness state is required
to flip-flop between live and dead, is linear, whereas the use before
initialisation analysis, in which variable states can only proceed
monotonically at a merge, from live to dead, and can only change
on assignment from dead to live, is in fact more complex!






John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 18, 2021, 11:57:20 PM11/18/21
to felix google
Not sure quite what happens here:

chip buffer
connector b
pin backtrack: %<char
pin filein: %<char
pin charout: %>char
{
var gotchar = false;
var ch: char;
C_hack::ignore(&ch);
chip reader
connector con
pin chan: %<char
{
while true do
ch = read con.chan;
gotchar = true;
done
}
….

Without the C_hack, this gives an error when ch is used, because it isn’t
set on all paths. This is because the while loop is equivalent to

start:>
if not cond goto finish;
loop_body;
goto start;
finish:>

so it is possible in general that the branch to finish is immediately
taken, and the loop body assigning to ch is never executed ..
even though in this case the condition is true, so the branch
to finish should have been eliminated. Any which way, this is ALSO a problem
in another setting:

//$ Reverse elements of an array.
fun rev[T, N:COMPACTLINEAR] (x:array[T, N]): array[T, N] = {
var o : array[T, N];
var n = len x;
C_hack::ignore(&o); // to fool use before init
if n > 0uz
for var i:size in 0uz upto n - 1uz
call set(&o,n - 1uz - i, x.i)
;
return o;
}

The problem here is, if n <= 0 we just return the uninitialised variable o.
Actually that’s FINE because the type will be unit, and the variable
should (eventually) be elided. The return should reduce to returning
the empty tuple ().

But, after monomorphisation, the WHOLE FUNCTION should be elided
also. Functions returning unit cannot be any use.

I need to track this down. Both cases should have been eliminated.
Note, the checks are ONLY being done post-monomorphisation.

Unfortunately the debug info doesn’t show the whole function after monomorphisation,
it shows the original source. So I’m not 100% sure what is actually there :-)


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 19, 2021, 10:48:36 AM11/19/21
to felix google
So, the loop problem is bad. Note that in Felix a basic loop is NOT a block.
So you can do this:

while cond do var x = 1; done
return x;

because basic loops just do gotos. You have to use begin/end to make
blocks:

begin
while cond do
begin
var x = 1;
end
done
end


The outer begin/end encloses the whole loop, the inner one just
the loop body. This matters in similar for loops because of where
the control variable is defined.

Note, we HAVE to use such constructions to avoid problems with
service calls inside loops (service calls cannot be done with anything
on the machine stack other than the scheduler return address).

However the basic problem is more fundamental:

var x :&int; // uninitialised
while cond do x = 1; done


return x;


Now, *I* know that the loop always executes, but the compiler does not.

In the serialise case, I fudged it by initialising the naughty address variable to NULL.
Actually that MAY have been a read error! The routine actually cannot handle an
empty string input! Now, it returns a NULL address if it gets one, which is no
worse than before. Still, the counter involved should have a pre-condition that
says it’s not zero.

Problem is that would not help, not even a branch on zero would help, skipping
the loop, because that would require data flow analysis to determine the loop
body would always execute once.

The bottom line here is that if you have a loop and return some value
afterwards that the loop assigns, you need to put a dummy initialisation
in to stop the use before init routine barfing.

I used

C_hack::ignore(&varname);

to do this. It fools the routine into thinking varname is assigned.
But this is really ugly!

NOTE THAT THE ROUTINE IS WORKING AS EXPECTED HERE.
I DESIGNED IT TO ENSURE INITIALISATION STATICALLY.


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 19, 2021, 12:39:31 PM11/19/21
to felix google
Here’s another one.

At present, the algo doesn’t handle structs or unions. In fact BOTH are
represented by cstruct.

But with a union, you only assign ONE of the fields, and that
initalises the whole union. The algo knows nothing about this.

Variable e defined at
/Users/skaller/felix/src/packages/gui.fdoc: line 213, cols 7 to 25
212: var clock = Faio::mk_alarm_clock();
213: var e : SDL_Event;


SDL_Event is a union with an internal tag, represented as a cstruct.
This screws up in ALL the GUI code.

Otherwise all the regression tests now pass.

There is one more case, in the GUI code and probably in the
Felix tools built, involving a record which it thinks isn’t initialised.
It’s weird because the pointer is taken but it seems not to see that.

Variable cfg defined at
/Users/skaller/felix/src/packages/flx_profile.fdoc: line 143, cols 5 to 27
142: var profile = FlxProfile::dflt_profile();
143: var cfg : config_type;
***********************
144: copy_profile &cfg profile;


I mean there it is, the address is taken on line 144.
This thing has over 10 fields, and it thinks the fields are not initialised.
But the address taking should cause all the fields to be considered initialised
as well as the top level variable.


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 19, 2021, 1:01:45 PM11/19/21
to felix google
This one is weird but may be related to the cfg problem:

Share init error: Using uninitialised shared variable
(82869:->fmacs)
Variable fmacs defined at
/Users/skaller/felix/src/packages/buildtools.fdoc: line 778, cols 7 to 48
777: // Felix platform macro
778: var fmacs = ([cmd.os+'/([^/]*\\.flxh)']);
******************************************
779:


In instruction xs<78278> := fmacs<60928>varname;
Detected at/Users/skaller/felix/src/packages/lists.fdoc: line 723 col 5 to line 730 col 6
722: //$ Convert a list to a stream.
723: gen iterator (var xs:list[T]) () = {
724: while true do
725: match xs with
726: | Snoc(t,h) => xs = t; yield Some h;
727: | #Empty => return None[T];
728: endmatch;
729: done
730: }
731: }

Obviously, the variable fmacs IS initialised! It’s a list!

I wonder if this is related to currying …??



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 19, 2021, 1:07:17 PM11/19/21
to felix google


>
> Share init error: Using uninitialised shared variable
> (82869:->fmacs)
> Variable fmacs defined at
> /Users/skaller/felix/src/packages/buildtools.fdoc: line 778, cols 7 to 48
> 777: // Felix platform macro
> 778: var fmacs = ([cmd.os+'/([^/]*\\.flxh)']);
> ******************************************
> 779:
>
>
> In instruction xs<78278> := fmacs<60928>varname;
> Detected at/Users/skaller/felix/src/packages/lists.fdoc: line 723 col 5 to line 730 col 6
> 722: //$ Convert a list to a stream.
> 723: gen iterator (var xs:list[T]) () = {
> 724: while true do
> 725: match xs with
> 726: | Snoc(t,h) => xs = t; yield Some h;
> 727: | #Empty => return None[T];
> 728: endmatch;
> 729: done
> 730: }
> 731: }
>
> Obviously, the variable fmacs IS initialised! It’s a list!
>
> I wonder if this is related to currying …??


NO NO NO! This is a REAL ERROR! Yea!!!!

println$ 'Copying platform macro ..';
for pattern in fmacs perform
CopyFiles::copyfiles(cmd.repo/'src'/'config', pattern,
cmd.target_dir/cmd.target_bin/'lib'/'plat'/'${1}',true,cmd.debug);

…..

// Felix platform macro
var fmacs = ([cmd.os+'/([^/]*\\.flxh)']);


See? The variable really is assigned AFTER use!



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 21, 2021, 12:59:22 PM11/21/21
to felix google
Ah, I have tracked down this one:

var clock = Faio::mk_alarm_clock();
var e : SDL_Event;
// dummy first event
e&.type <- SDL_FIRSTEVENT.uint32;

An SDL_Event is a cstruct, which represents a C union.

But the problem is the encoding of the assignment:


(apply((prj0:RWptr(SDL_Event,[]) -> RWptr(uint32,[])), &e<64016>ref) :>> Wptr(uint32,[]))
<- (_ctor_uint32<65478>primfun SDL_FIRSTEVENT<65350>varname);

Now, there is clearly an address taken of variable e in there.
The problem is, I’m tracking fairy variables. The detector sees that there is
a projection of the pointer, and tries to say there’s a write to that.

This would be a disaster for a union. But that is not the problem!
The problem is that Felix is using a TUPLE PROJECTION on the cstruct.
Presumably it does that for structs as well. But structs are not split
into fairies at the moment (the split is done only for tuples, small arrays, and records,
because I haven’t passed in the symbol table to find the fields of the nominal types).

So, the algo is not finding a fairy for the specified projection.
Unlike records, struct projections use the field NUMBER not its name
so we use a proj projection not an rproj.




John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 21, 2021, 4:55:40 PM11/21/21
to felix google
Arggh. So now, the use before init passes all tests and rebuilds Felix without
error so it’s time to turn it into a hard error checker.

In the process I have found (but not corrected) bugs in the uniq usage algorithm
too. The detector needs to be fixed.

Both need test cases. I have yet to figure out how to automate tests that are
supposed to fail. The output error messages are subject to change and there’s
no comprehensive set of error codes. However the compiler does have a -e switch
which causes a 0 return if there’s an error or 1 if there isn’t, designed for an
error/no error detection.


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 21, 2021, 4:59:36 PM11/21/21
to felix google
Except for this weirdo ..

tools/flx_build_prep -> flx_build_prep
Share init error: Using uninitialised shared variable
(82869:->fmacs)
Variable fmacs defined at
/Users/skaller/felix/src/packages/buildtools.fdoc: line 778, cols 7 to 48
777: fpcs += cmd.os/'([^/]*\\.fpc)';
778:
******************************************
779: // setup header files to copy


In instruction xs<78278> := fmacs<60928>varname;
Detected at/Users/skaller/felix/src/packages/lists.fdoc: line 723 col 5 to line 730 col 6
722: //$ Convert a list to a stream.
723: gen iterator (var xs:list[T]) () = {
724: while true do
725: match xs with
726: | Snoc(t,h) => xs = t; yield Some h;
727: | #Empty => return None[T];
728: endmatch;
729: done
730: }
731: }

In function with code
2497 instructions, too many to list (limit 20)


Now, there was an error here before (a real one) I fixed. But the fix doesn’t
seem to have taken. Note the bad variable is fmacs, but the source is showing
fpcs. I did a make extract. This shouldn’t be possible..


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 24, 2021, 9:24:25 AM11/24/21
to felix google
So the use before init routine now runs on both pre-monomorphisation polymorphic code
and also optimised monomorphic code without barfing, however some hacks like this:

C_hack::ignore(&x);

are required in some cases to fool the algorithm into believing the variable
x shown is initialised. Not all the code is hacked so some cases will arise
later probably. In some case, you should just do an actual initialisation even if
one isn’t required.

There are some cases I don’t understand. One of them is

tools/flx_build_prep -> flx_build_prep
Share init error: Using uninitialised shared variable
(82869:->fmacs)
Variable fmacs defined at
/Users/skaller/felix/src/packages/buildtools.fdoc: line 778, cols 7 to 48
777: fpcs += cmd.os/'([^/]*\\.fpc)';
778:
******************************************
779: // setup header files to copy


In instruction xs<78278> := fmacs<60928>varname;
Detected at/Users/skaller/felix/src/packages/lists.fdoc: line 723 col 5 to line 730 col 6
722: //$ Convert a list to a stream.
723: gen iterator (var xs:list[T]) () = {
724: while true do
725: match xs with
726: | Snoc(t,h) => xs = t; yield Some h;
727: | #Empty => return None[T];
728: endmatch;
729: done
730: }
731: }while true do
done

this is just nonsense.

The main problem is when dynamically something is assigned, but statically
there’s a path where it isn’t. For example

for var i in 0..9 perform &x.i <- 1;

initialises an array x, but the algo cannot see that, since i is a variable.


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Nov 24, 2021, 9:27:56 AM11/24/21
to felix google
I am now removing assignment from the compiler completely.
The storeat instruction must now be used.

Currently, assignment allows an expression on the LHS.
It also adds match cases in many places in the compiler, especially
in the use before init algo. This just complicates things.



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 5, 2021, 8:53:11 AM12/5/21
to felix google
Felix currently supports uniqueness types using a combinator:

uniq T

and a coercion

box x: T -> uniq T

to construct one. When a boxed value is stored in a variable, the variable
is said to be live. No actions at all are allowed on a boxed variable,
other than to move the box out of it, leaving the variable in a dead state.

For example

var x = box 1;
var y = x;

At this point, variable x is dead, and variable y is live and contains the
box originally found in x.

A box can be passed to a function, leaving the holding variable, if any,
dead:

var x = box 1;
var n = f x;

The function acquires the box, it now resides in the function’s
parameter, and the variable x is dead.

A variable must be dead when it goes out of scope otherwise
the box will be inaccessible and the resource its contents refer
to will leak. So this function is invalid:

fun f() { var x = box 1; return 1; }

because the box is lost, and so its contents are also.

In order to use the object referred to by a boxed value,
the box has to be removed. This is done by the unbox operator.

var chp = box (malloc[char] 42);
fun clear (q: uniq &char, n: int) = {
var p = unbox q;
memset (p, char 0, n);
return box p;
}
chp = clear (chp, 42);

Here the function clear accepts the box, making chp dead,
then it unboxes it, clears the referent of the previously
wrapped up pointer, reboxes the pointer and returns it,
resulting in the variable being made live again due to the assignment.

Now we must be clear about the contracts involved. Unboxing a reference
is always safe because we OWN the reference uniquely. We can do
anything with it, including modify the referent, deleting it, or sharing it.
We can do this even in a pure function, because any effects cannot be
observed since we have been *promised* we’re the sole owner.

Passing a box to a function is NOT transparent: it changes the state
of the source variable. But the function itself remains pure.

Boxing a value is not automatically safe. A programmer boxing
a value is *asserting* sole ownership. In the function above,
the variable p is sharable. It is in fact shared (aliased) by passing
it to memset. However we know memset forgets it on returning.
So after memset is called, p is the sole alias again, and we are
justified in boxing its contents, because by the time the box
is possessed by another variable, the function has returned
and the shared reference p is forgotten.

Note the proof here demands not only that the variable being
boxed is indeed solely owned, but it will not be used after
the boxing. A counter example:

var p = unbox q;
var r = box p; // OK
println$ p; // This is safe but ..
delete p;
return r; // WOOPS

It’s safe to do the println$ because it only BORROWS p.
So even though r is boxed, asserting its contents are a unique
reference, the original unboxed sharable reference is indeed
shared, violating the contract. However the violation is benign
because println doesn’t reshare it.

Even the delete is perfectly fine, because we have not USED
the variable r yet. But when we return it, we have lost control
and it now happens to contain a dangling pointer..

THE POINT is that boxing is unsafe and the programmer
is responsible for using it correctly.

Note that using shared values is intrinsically unsafe!
We can delete an object even if we are not the sole owner.
This is why we might use refcounting or GC.
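For what it’s worth, a loose C++ analogy (not Felix semantics, just the ownership shape): std::unique_ptr plays the role of the box; moving it out of a variable leaves that variable dead, and the sole owner may freely mutate the referent:

//////////////////////////////////////////
#include <cstdio>
#include <cstring>
#include <memory>

// “unbox”, mutate the referent, “rebox” and hand ownership back
std::unique_ptr<char[]> clear(std::unique_ptr<char[]> q, int n) {
  char *p = q.get();          // we own it uniquely, mutation is unobservable
  std::memset(p, 0, n);
  return q;                   // implicit move back to the caller
}

int main() {
  std::unique_ptr<char[]> chp(new char[42]);   // the “box (malloc ...)”
  chp = clear(std::move(chp), 42);  // chp is dead during the call, live after
  std::printf("%d\n", (int)chp[0]);
}
//////////////////////////////////////////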

============

Now, a unique value can be safely passed to a function expecting a shared value:
the unbox operation is inserted implicitly. This is because uniq T is a subtype of T.
This also impacts nested uniques in tuples, for example.

Furthermore, a nested function can access a unique value and this is tracked
by the uniqueness checker if the nested function is explicitly called.
Note that a function can only make the parent uniq dead, since it cannot
assign to it, since functions aren’t allowed to have side effects.

Indeed this is one reason we need to think about borrowing!
But we’ll come to that in a minute. More of a problem is if a nested
function which accesses a parent unique is

(a) passed as an argument to another function
(b) returned

Note that if we did explicitly call a nested function which uses
the uniq, the uniq becomes dead, so we can only do it once
unless we relived the variable in between.

Felix creates closures by passing a pointer to the parent stack frame
to the child. So the issue is the same now as addressing a uniq.
The current system simply ignores all these cases. We don’t know
how many times a closure is called, and we don’t know what happens
to a pointer to a unique variable: is it used to read or write the variable and
if so how many times and in what order? We don’t know, without global
data flow analysis which tracks aliases.

============

Now the next issue: at present List::map returns a uniq list.
So if we just say

var x = map f lst;

x is unique and cannot be shared. This breaks old code
written before we had unique. I have fixed the library to say

var x = unbox (map f lst);

to get back the old behaviour but only a few functions so far in the
library use uniq. There’s no problem if the variable is typed:

var x : list[int] = …

because the unboxing is implicit due to subtyping rules.
But it’s very inconvenient. At one time you could say

once x = …

and it meant the x was uniq. There was no uniq type constructor then.
We could allow:

shared x = …

to force the unboxing of uniqs, and

uniq x = …

to either require the RHS to be uniq OR to box the RHS if it is
not. in the latter case it is a contract assertion, rather than a type
constraint. So I don’t like that. I prefer it be an error to say

uniq x = 1; // ERROR

you have to say

uniq x = box 1; // OK

The idea is just that you’re expecting a function to return a uniq and should get
an error if it doesn’t.

===============================

Now FINALLY to borrowing. The idea of borrowing is simple enough:
it is a pain to move a uniq to a function, and force the function to move it
back in the return value AND the client to assign it:

var p = box ...
def var n, p = f p;

We have to use “def” here because it unpacks a tuple and then
allows the components to be assigned OR to initialise a new variable.

This is ugly. The idea of borrowing is that a box contents are copied
and put i a transparent wrapper which is passed to a function
WITHOUT making the variable dead. The function can then use
the referent provided it FORGETS it before returning.

in other words this is equivalent to making the variable dead,
then on return relivening it (without an assignment), Since
nothing in a single thread of control can happen to the variable,
unless the called function does it (since the caller is suspended),
this should be fine EXCEPT for pointer aliases (as always).

So let’s think about the

borrowed T

type. If a parameter has this type a uniq or shared value
can be passed safely, if uniq it is NOT made dead.
Now the function has to unwrap the parameter to make
it accessible as usual but is then required to FORGET it
before returning. Mutation is still acceptable.

A borrowed value can be passed to another function.
It is accessible again after that function returns.
The function has to abide by the same rules.
So an unboxed value can NOT be safely passed:
the caller must be sure the function does the right thing.

Borrowed values can NOT be implicitly made shared!
Unlike uniq parameters we don’t own the referent.
On the other hand, a uniq value can always be loaned
safely, provided the borrower abides by the contract.

So uniq T is a subtype of borrowed T.

Again, a shared value can be passed safely where a borrowed
value is expected. So T is a subtype of borrowed T.

SUBTYPE RULES:

uniq T -> T -> borrowed T

Note that by transitivity

uniq T -> borrowed T

Now we need some value operators. To access a borrowed
value, we can try “unwrap”. unlike unbox, this operation
is UNSAFE. This means it imposes a responsibility on the
unwrapper to FORGET the unwrapped value.

unwrap: borrowed T -> T

Wrapping is safe:

wrap: T -> borrowed T
lend: uniq T -> borrowed T

These operations are safe and can be implicit.
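Again as a loose C++ analogy only (the subtyping here is Felix’s, not C++’s): passing a unique_ptr by value is the uniq move, while passing a plain pointer or reference is the borrow, where the callee may use the referent but must forget it before returning:

//////////////////////////////////////////
#include <cstdio>
#include <memory>

// borrow: the caller keeps ownership; we must not store or delete it
int peek(const int *borrowed) { return *borrowed; }

// uniq: ownership moves in and the referent dies when we return
int consume(std::unique_ptr<int> owned) { return *owned; }

int main() {
  auto p = std::make_unique<int>(42);
  std::printf("%d\n", peek(p.get()));          // p is still live afterwards
  std::printf("%d\n", consume(std::move(p)));  // p is dead from here on
}
//////////////////////////////////////////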

==============================

SUMMARY.
==========

The system I propose is based on the contract programming method.
Unlike systems like Rust, UNSAFE operations are intrinsic to the design.
An unsafe operation imposes a responsibility to obey the contract on
the user of the unsafe operation.

The safe operations, on the other hand, assume the provider did
the right thing.

The type system has two roles. First, it provides the safe implicit coercions
based on subtyping rules. Secondly, auxiliary algorithms must track
liveness states.

Borrowed values must be forgotten. This means they cannot be put on the heap.
As usual, pointers to variables, whether obtained by addressing or closure formation,
are not tracked.

A borrowed parameter expresses a contract assertion that the borrowing
function will forget the parameter and its contents. For a local
variable, borrowed type implies that the variable will not be used
after the variable it borrowed from (if uniq) is killed.

There are thus two kinds of contracts: those that are enforced
by the compiler, including those that involve implicit safe
coercions, and those that impose constraints on the programmer
to ensure the resulting system is sound.

UNLIKE other systems, the type system does NOT ensure soundness.
Rather, it partitions the responsibility, so that it is responsible for
enforcement of parts of a contract, and the programmer the other parts.

The resulting system is SIMPLE and the programmer responsibilities
MANIFEST. The programmer should be able to prove their part
of the contract is adhered to, so that the compiler type checks,
which prove the other parts are adhered to, result in a sound system.

All this with a caveat that the compiler type system cannot track
aliases. Explicit addressing is MANIFEST so use of the addresses
is again the programmer's responsibility.

Unfortunately, CLOSURE FORMATION is NOT manifest (explicit)
and is therefore difficult for both the compiler and programmer
to account for.

Still .. something is better than nothing. We do not support
throwing out expressivity just because the resulting system
allows coding which is hard to prove correct. Restrictions just lead
to workarounds which are even harder to prove correct.



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 6, 2021, 12:10:33 AM12/6/21
to felix google
So this now works as expected …

/////////////////
// check borrowing
proc check() {
// explicit wrapping
fun f(x: _borrowed int) : int => unwrap x;
var x = wrap 1;
println$ f x;
println$ f x;

// implicit wrapping of shared
var y = 1;
println$ f y;
println$ f y;

// implicit re-wrapping of uniq
var z = box 1;
println$ f z;
//println$ f z;
}
check();
/////////////////

There are two problems I found. If we uncomment the last line,
the program fails with a uniqueness error, as expected, because
I haven't edited the uniqueness checker yet. So borrowing a uniq
still kills the variable when it shouldn't.

The same code at the top level works: I believe this is a known bug.
Top level code should be included in an _init_ routine and checked
but the *variables* at the top level are NOT locals of the _init_
routine, but instead get put into the thread frame object.

The uniqueness checker can’t handle this at the moment.
It’s not the only routine in the compiler that doesn’t handle globals
correctly. The problem is, the routine checks every function, and looks at
the local variables of that function. But _init_ doesn’t have any.

In fact, ALL functions can get at the thread frame, but it is ignored
in all of them. This is a design issue. The problem is we have no idea
of the liveness of a global variable, except in the _init_ routine, where
it starts off dead, and then is modified by calls to other functions.
These functions are siblings of _init_, not children.

Calls to siblings are ignored because they cannot access a sibling's
locals.

A solution to this is to make everything a local of _init_.
So not only are variables children of _init_ but every function
is also nested in it. But this is really bad and not tenable because
in every program, type class functions would be children of different
parents (rather than being top level).

The best solution is to BAN top level variables that are not shared.
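For concreteness, a sketch of the top level case described above (the same pattern
as the check() example, but with the variable stored in the thread frame):

fun f (x: _borrowed int) : int => unwrap x;
var z = box 1;    // top level: z lives in the thread frame, not as a local of _init_
println$ f z;
println$ f z;     // a double use the current checker cannot diagnose at top level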



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 6, 2021, 2:59:47 AM12/6/21
to felix-l...@googlegroups.com
So now this works:

proc check() {
// explicit wrapping
noinline fun f(x: _borrowed int) : int => unwrap x;
var x = wrap 1;
println$ f x;
println$ f x;

// implicit wrapping of shared
var y = 1;
println$ f y;
println$ f y;

// implicit re-wrapping of uniq
var z = box 1;
println$ f z;
println$ f z;
kill z;
}
check();

In the third case, f borrows z, so z remains live, and has to be killed
by, for example, the kill operator (which does nothing except use up
the variable z, which kills it).

This only works if the function domain is a borrow type; it does not
currently work if it's a tuple with one component being a borrow type.
That is, something like:

fun g(x:int, y: _borrow int) => ….

will not work. It should work if we called

var z = box 1;
g(1, z)
kill z;

The problem is, the argument tuple has STOLEN z. It is then
borrowed by g. Then the tuple evaporates .. and its uniq component
is not used up. So it's an error. The kill z is also an error because
z was stolen, so it is already dead. To make this work we would have to do this:

g (1, loan z)

This is so inconsistent I think you should always have to say loan explicitly:

f (loan x)

should be required. This simplifies the checking because now there’s only
ONE function which accepts a uniq without killing it (namely the loan function).

I think loan should ONLY be applied to an actual variable though.
You can box any expression, because you're boxing a value. However
you can only loan a variable. If the argument is an expression, and loan
returns a borrowed type, the original unique can never be used again,
so it is an error.

This means "loan" is similar to address taking. I just removed a "uniq"
operator that worked like that. Heh. So actually loan should be an intrinsic
operator, not a function. Otherwise we have to check that its argument is a variable.

Hmm. But the issue is that we decided uniq t is a subtype of t, which is a subtype of borrowed t,
which means the coercion should be implicit.

The funny thing is, there IS or should be at least an explicit coercion injected
by the compiler. So actually I don't need to check a call to a function to see
if the parameter type is borrow: if it is, the uniq is already coerced to a borrow.


hmmm .. rethink ..



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 6, 2021, 7:00:46 AM12/6/21
to felix google
Ok so this now works:

var a = box 1;
fun g(x:int, y: _borrowed int) : int => x + unwrap y;
println$ g (1,(loan a));
println$ g (1,(loan a));
kill a;

I had to make loan an operator. You can loan any variable.
If it is uniq T, the loan is borrowed T, if it is borrowed T, it stays borrowed T,
and if it is plain T, it becomes borrowed T.

It translates into a coercion exactly as if you had written one, for example:

loan a —> (a :>> _borrowed int)

In direct cases like

var z = box 1;
println$ f z;
println$ f z;
kill z;

the compiler already inserts a coercion, so you can, but do not need to, loan z
in this case. However, tuple formation preserves type, so using a uniq variable
as a tuple component makes the component uniq, killing the variable.
Since functions always accept a single tuple argument, a loan is always
required if a variable is a component except for a unary tuple which in Felix
is just the original value.

I have to review handling of projections .. another email.


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 6, 2021, 1:21:22 PM12/6/21
to felix google
So the handling of projections is weird. This works:

var b = box (1,2);
println$ b.1;

but this doesn’t:

var b = box (1,2);
println$ b.1, b.0;

because b is used twice. This works:

fun f(x: _borrowed int) : int => unwrap x;
var b = box (1,2);
println$ f b.1;

The point is there’s special handling of projections of boxed products.
This works:

var b = box (1,2);
var c : int = b.0;
println$ c;

The projection is consuming the box, but the result of the projection is NOT boxed.



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 7, 2021, 8:54:16 AM12/7/21
to felix google
Now it’s all wrong.

You cannot do ANYTHING with boxed values except pass them to a function
expecting a boxed value EXCEPT THAT unboxing is implicit, so you can actually
apply a function expecting a shared value too.

However, overloading will see the boxed value. Therefore this should not work:

var x = box (1,2L);
println $ x.1;

because projection formation requires the argument x to be a tuple, but it isn't.
If you explicitly form a projection

var p = proj 0 of (int * long);
println$ x.p;

it doesn't work either because a projection is a special term; it is not a function.
But it can be used like one. BUT

var p = proj 0 of (int * long);
fun apl(x:int * long) => p x;
println$ x.apl;

This one WORKS due to the implicit casting away of boxing.

I REALLY got confused by this working:

var x = box ("Hello");
println$ x.0;

That WORKS! It prints “H”. It took a while to remember that

s.0

means the first char of a string so the x is implicitly unboxed and
passed to the string indexing function! This works too:


var a = box ("Hello");
println$ a.(1..5);

Nasty Gotcha!

Borrowed values, on the other hand, act sanely. There's no implicit conversion
FROM a borrowed value. However they can be lent to a function expecting
a borrowed value, including “unwrap” which returns the value which was wrapped,
which can then be used freely provided the value isn’t saved after function return.

I was confused because the uniqueness tracker drills down into uniq values which
are products. However, there's no way to actually trigger that, at least on a get.
For sets there is weird special handling in the unification engine, so subtyping
of the uniq->shared kind is respected by pointers (normally pointers are invariant).




John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 7, 2021, 9:10:39 AM12/7/21
to felix google
So just to clarify the rules:

1. If you box a value you're promising it is a unique reference to something.

2. If you accept a box, you can assume it is unique. You can pass it
on or unbox it. You can do anything with the unboxed value, because you own it.
In particular, a function can modify the referent WITHOUT this being considered
a side-effect (because no one else can observe the modification).
The one thing you CANNOT do with a boxed value is nothing!
You can delete it, but you cannot ignore it because you’re the sole
owner and that would be a leak.

3. You can lend a box to a function expecting a borrowed value.
In this case the state does not change. The function accepting it
can NOT modify it because it is not the unique owner.
So a modification would be a side effect.

It can also NOT store it on the heap. It is required to forget it,
so on return, the original variable remains the unique owner.
Note, you can lend a shared variable too. Lending is always safe.
The constraints apply to the borrower.

Note you can ignore a borrowed value .. that’s certain to ensure
you forget it!

Note, a borrowed value is opaque so you have to unwrap it explicitly.
This is an unsafe operation, because it imposes the constraints.

4. A procedure borrowing a value CAN modify the referent.

So if you have a boxed pointer to a C array, a function like

var up = upcase (lower)

can modify its argument if it is boxed, but NOT if it is borrowed.
On the other hand a procedure

upcase(loan s)

can modify the string.
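A sketch of the two signatures implied by this example (the names and bodies are
illustrative only, not library functions):

// function form: owns its uniq argument, so mutating the referent is not an
// observable effect, and the result has to be handed back
fun upcase_owned (s: uniq +char) : uniq +char => s;

// procedure form: borrows its argument and may mutate the referent,
// but must forget it before returning
proc upcase_loan (s: _borrowed +char) {
  var p = unwrap s;   // p : +char, usable locally; p must be forgotten before return
  // ... mutate the chars via p here ...
}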


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 14, 2021, 6:50:24 AM12/14/21
to felix google
So I find an issue with the uniq/share/borrow system.

Subtyping is required to be transitive and all paths semantically equivalent.
But consider:

U -> S // move
U -> B // copy no move (actually move and put back automatically, or, copy and forget)
S -> B // copy no move (who cares?)

The problem is the composite

U -> S -> B

moves the original U so it is dead. Whereas U -> B notionally kills it, but only temporarily.
So the U dies in the composite but not the direct operation.

I.e. U -> S ->B != U -> B contrary to requirements.

Note U -> U also moves, S -> S copies, and B -> B copies.
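A sketch of the discrepancy in code, spelling each path out with the explicit
coercion syntax from an earlier message:

fun f (b: _borrowed int) : int => unwrap b;
var u = box 1;
println$ f (u :>> _borrowed int);           // U -> B directly: u stays live
println$ f ((unbox u) :>> _borrowed int);   // U -> S -> B: unbox moves u, so u is dead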

The subtyping coercions are all safe of course.
Unbox is safe, loan is safe, box and unwrap are not.

No idea how to solve this.


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 16, 2021, 1:03:30 AM12/16/21
to felix-l...@googlegroups.com
There's also another worrying issue. In the literature a substructural type system, such as
a linear or affine system, is augmented with sharable values by marking them explicitly;
the notation used is usually

!x

This allows the normal intuitionistic logic which allows sharing to be applied to a
subset of variables.

In Felix, we started, historically, with shared variables and added uniqueness types
so that if T was a sharable type, then uniq T was a unique version of it. Recently
I also added _borrowed T as well.

Note that I have chosen to use ordinary type constructors in the compiler, and put
the logic into the type system, so it applies to ANY expression, not just variables.
The usual entity we are interested in is going to be a pointer. Note that the notion
of ownership and uniqueness applies to the reference or pointer, saying that it
owns the object it points to, that is, it is the only pointer to that object.

A key problem with this representation is that no operations can be performed
on the uniq T value, because the pattern matching in the compiler hits a brick
wall “uniq”. For example a product type

A * B * C

will have projections

proj 1: A * B * C -> A
proj 2: A * B * C -> B
proj 3: A * B * C -> C

but none of these functions can be applied to

uniq (A * B * C)

To actually use a uniqly typed value, we have to “cast away” the uniq
from the type. This can be done with a coercion, or with the unbox
operator on an expression, which is just the required coercion:

unbox: uniq T -> T

So for example if

up: uniq (+char)

is a uniquely owned pointer to a C char array, to access the array we have to do

var p : +char = unbox up;

and now p can be used to do things with the char array. The coercion is safe
because we owned the char array, in the sense we’re the only one that
has a copy of a pointer to it. The key point is that even though p is NOT
uniquely typed, it is STILL the only pointer to the char array. In fact,
the pointer `up` is no longer accessible so that claim is very true!

As you know the operation

box: T -> uniq T

is unsafe. It is justified only when we know, by local reasoning,
that we actually have the only pointer to the object. In fact,
writing the conversion is an assertion that this is the case,
on which the owner of the boxed object is entitled to rely.

A key property, somehow related to separation logic, is that if
we have two boxed values, they refer to separate objects.
Therefore modifications of one will not interfere with the other.

There is a similar operator:

unwrap: _borrowed T -> T

which is unsafe in the sense that a borrowed object must be forgotten.
The unwrapping is an assertion we will comply with that requirement.

The machinery therefore splits safety responsibility into two parts:

(a) a part the compiler guarantees
(b) a part the programmer must guarantee

so the type system is "conditionally sound" in the sense that
it is sound only if (a) the compiler algorithm is right in tracking
moves and (b) the programmer meets the contract requirements.

Unlike, say, Rust, where the compiler ensures full compliance.
[in fact in Rust to get anything done you have to use code marked
unsafe to disable the checking, which comes to the same thing,
achieved by a different contract]

I am happy with uniq being opaque, that is, blocking ALL operations
OTHER than moving the object. Note that “unbox” does indeed do
a move.

I am NOT happy that borrowed values have to be unwrapped.
A function like

int strlen(p: _borrowed +char) …

should just be able to use p as if it were shared, because there are only two
ways to use a borrowed value:

(a) loan it to yet another function
(b) unwrap it to do some stuff on the underlying object

The problem is that this code makes a copy of p.

var q = unwrap p;

so that p remains of borrow type, and the value is NOT moved,
so that q is a copy. Of course you can always do this

somefunc ( unwrap p )

and unwrap locally .. but you’d have to do that on every use.


The explicit coercion is there for a reason: the operation is unsafe.
But we already KNOW that from the type of p.


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 23, 2021, 6:26:45 PM12/23/21
to felix google
I’m having lots of problems with _borrowed types. Indeed _uniq types are bad enough.

At the moment, the idea is if you have a function like

f(_borrowed T)

then if u has _uniq T type then

f(u)

will just work and not kill u. However most functions in the library were designed
with shared types in mind (because there was no uniq or borrow stuff).

I fudged println so it takes type _borrowed T. This got rid of a lot of errors.
The library at the moment makes `new T` return a unique pointer and also
some list operations like List::map return a unique list. But now this fails:

var b = darray[int]();
iter println of (int) b;

Here’s the actual type overload is looking for:

iter[] of borrowed((int)) -> void,darray[int]:TYPE

which of course fails, since println of (int) has type int -> void,
not _borrowed int -> void. In fact println has type

_borrowed T -> void

for all T. There’s no println of (int) in the library.

AN ALTERNATIVE:

1. Get rid of borrowed types altogether. This also solves the
inconsistent subtyping problem "in vacuo" :-)

2. Make the user write

u.loan

which now returns a plain T.

3. The onus is on the user writing this to only call a function
accepting a shared type but which actually does forget it.

Note without the “.loan” the same function will be called!
At least most of the time, lol. Because of the implicit coercion
from _uniq T to T.

The difference is when you use “loan” the variable isn’t killed,
and when you don’t, the value is moved and the variable killed.

Note that loan will fail if there is a function accepting only
a uniq, and call the shared variant of it if both uniq and shared
versions are available. So in this case loan does impact the semantics
by changing the overload.

HMMMM…..


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 23, 2021, 8:35:35 PM12/23/21
to felix google
Yea. It isn’t going to work. I overloaded println:

proc println[T:BORROWED with Str[T]] (x:T) { fprintln (cstdout, x); fflush cstdout; };
proc println[T with Str[T]] (x: _borrowed T) { fprintln (cstdout, x.unwrap); fflush cstdout; };

This handles both shared and borrowed types. So if the argument is unique you can
lend it. The problem is, now I have to do the same for fprintln, which calls write.
So I have to do the same for that and so on.

The problem is, borrowed is a supertype of shared but is inaccessible until unwrapped.
The kind BORROWED is universal, it doesn’t mean borrowed, it means borrowed or
shared or unique.

The reason the first T must be BORROWED is that otherwise unification fails:

T1 :TYPE < _borrowed T2

would unify with T1 = _borrowed T2, but it fails because T1 only has kind TYPE,
so it cannot be set to _borrowed T2. It has to be elevated to kind BORROWED to make
this work. But T will never be set to _borrowed anything, because if the argument
is borrowed, the second overload will be selected.

Using a combinator for borrowed unfortunately means that, to use it with any function
defined only for shared values, an explicit coercion (unwrap) is required to use the
borrowed value, since borrowed is a supertype of shared.

In practice a LOT of functions are borrowing. Only a few are actually going to
put the argument into a heap data structure so it persists past the lifetime
of the function. But then if we annotate them as borrowing, we have to
unwrap the parameter every time.

The existing code solution has the same problem: you can already borrow a variable
by taking its address, which the uniqueness analyser ignores. But then you’re passing
a pointer, and have to dereference it to get at the value. Same issue with slightly
different operators. Indeed if you want the function to just use the value and it’s
shared, you’re probably wanting an overload taking shared, since that works
for some expressions.

Rethink required ….


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 23, 2021, 9:13:35 PM12/23/21
to felix google
So my original idea was that if you call a function with a uniq variable,
it is borrowed. The only time it is moved is if the function accepts a uniq type.
This means, the existence of uniquely typed variables will be transparent,
that is, all old code will continue to work.

To actually leverage the uniqueness you need to force a move by adding
an overload to a function that has a uniquely typed parameter. This will
be selected if the argument is unique, because it is more specialised.
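A sketch of that idea (the function and its bodies are illustrative only):

// shared overload: old code keeps working, the argument stays live
fun render (s: string) : string => s + "!";
// uniq overload: more specialised, selected when the argument is unique,
// so the implementation is free to consume (and in principle reuse) the argument
fun render (s: uniq string) : string => unbox s + "!";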

The copying without killing is unsafe because the borrower thinks it’s ok
to put the thing on the heap. But the real problem is that there’s no way to tell
if a variable should be moved or copied on a given call, without examining
the called function type. The problem is that whilst this is easy, it is NOT
so easy if you pass a variable *containing* a unique type.

Throwing away uniqueness is only safe if, in doing so, a unique variable
is actually killed (made to go out of scope, or, at least the kill traced by
the uniqueness analyser).

Hum.

John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 24, 2021, 10:41:13 PM12/24/21
to felix google
We know composition of functors is just substitution. Felix has the C++ problem which I need
to fix, demonstrated here:

// forward functor composition (1 argument)
typedef fun \circ (f:TYPE->TYPE, g:TYPE->TYPE):TYPE->TYPE =>
fun (T:TYPE):TYPE => f(g T)
;

Amazingly this actually works:

typedef fun lst(T:TYPE):TYPE => list[T];
var x : lst int = ([1,2,3]);
var y: lst (lst int) = ([x,x,x]);
var z : (lst \circ lst) int = y;

y and z are the same type, a list of lists of int.

We note that in the forward composition, we apply g to the input type,
to produce an intermediate type which is the argument of f. In C++ terminology,
g effects a partial specialisation of f.

In the example, f’s type argument T is specialised to lst int.

Now, here’s the problem: f might have more than one type argument.
if it has 2, it’s called a bifunctor. For example a `map` data structure has
a key and value type, so it’s a bifunctor. Another one is

typedef fun pair(A:TYPE, B:TYPE):TYPE => A * B;

Here’s another type of functor:

typedef fun diag(A:TYPE): TYPE * TYPE => A, A;

Notice the result of this one is a pair of types, aka a type tuple.
Notice `pair` maps a type tuple to a tuple type.

Of course we can easily write a composition routine:


typedef fun \circx (f:TYPE * TYPE->TYPE, g: TYPE->TYPE * TYPE):TYPE->TYPE =>
fun (T:TYPE):TYPE => f(g T)
;

which, BTW, does NOT work (ARGGH) and also, since a type function is a non-function to Felix,
cannot use the same name.

How can we generalise? We obviously need kind variables


John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 25, 2021, 6:38:11 AM12/25/21
to felix google

This is now working … need an extra beta-reduction to make a pattern match succeed ..

////
typedef fun pair(A:TYPE, B:TYPE):TYPE => A * B;
typedef fun diag(Y:TYPE): TYPE * TYPE => Y, Y;

typedef fun comp(f:TYPE * TYPE->TYPE, g: TYPE->TYPE * TYPE):TYPE->TYPE =>
fun (T:TYPE):TYPE => f(g T)
;

var a : comp(pair,diag) int = (1,2);
println$ a;
///

Now, back to the point.

We can compose T->T*T with T*T -> T as shown, but the original
\circ won’t do it because the KIND of the original circ’s domain is

(TYPE -> TYPE) * (TYPE -> TYPE)

but the comp function has domain

(TYPE * TYPE -> TYPE) * (TYPE -> TYPE * TYPE)


what we actually require for composition is

(K -> J) * (L -> K)

and the result is a functor

L -> J

where J,K,L can be any categories. And of course we do this with
a kind variable (a type variable one level up that varies over kinds).

The kind variable, like a type variable, needs a constraint; for type variables
it's something like TYPE, here it would be KIND.

KIND is not a kind. It’s a sort. :-) That’s the next level up after kinds.

Now actually, this could go on forever, so we want to actually use TYPE,
and use context to know we really mean KIND.

Otherwise I have to do all the stuff again, unification, etc, and infinitely. :-)

So maybe this syntax:

typedef fun compose[J,K,L] (f: K->J, g :L -> K):L -> J =>
fun (t:T):T => f(g(t))
;

The kind variables could be qualified with a sort, the default is SORT,
and it’s the only sort at the moment ..though we’ll surely have SORT -> SORT popping
up and the whole story again (ad infinitum).


Note that the inner anonymous function binds to its parent context, otherwise
f,g wouldn’t be defined.

A polykinded function (well, that's polymorphism one level up, right) has to be named.
Polymorphic functions have to be named too. Anonymous functions cannot be polymorphic.
This is because there has to be a symbol table entry as the container of the children.

Felix has no way to do this:

var f = fun[T] (x:T) => x;
println$ f 1;

At least one problem here is that f is a variable and variables can’t really be polymorphic.
But more to the point, polymorphic types do not have a lambda form: for that you have
to use a functor. That is you cannot write

[T:TYPE] T * T

you can of course write:

typedef two[T:TYPE] = T * T;

but not

type[T:TYPE] (T * T)

You can, of course, if you use a type function, but otherwise you cannot have standalone
polymorphic types.
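The workaround is a named type function, modeled on the lst example from an earlier message:

typedef fun two_f (T:TYPE):TYPE => T * T;   // a named type lambda
var x : two_f int = 1,2;                    // beta-reduces to int * int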

Now the problem is a typedef like two above:

typedef two[T:TYPE] = T * T;

creates a symbol table entry two, with type parameter T and a body.
Typedef fun makes a symbol WITHOUT parameters, defined to be a lambda:

typedef fun f(…);

is the same as

typedef f = fun (…);

So a typedef entry has a slot for type variables, in the symbol table,
but we really can't steal that for kind variables because Ocaml would barf.




John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 28, 2021, 10:13:43 PM12/28/21
to felix google
>
> So maybe this syntax:
>
> typedef fun compose[J,K,L] (f: K->J, g :L -> K):L -> J =>
> fun (t:T):T => f(g(t))
> ;
>
> The kind variables could be qualified with a sort, the default is SORT,
> and it’s the only sort at the moment ..though we’ll surely have SORT -> SORT popping
> up and the whole story again (ad infinitium).


So I now have a fully general concept for this. It is pure category theory, it is trivially simple,
and it is applicable to all high level programming languages including Felix, Haskell and Ocaml.


We will start with a basic purely functional model because it’s easier. There are many extensions.

Initially, we have values of some primitive types. I will use programmer notation rather than
academic notation because it is easier to write in an email. We have some kind of expression:

x + 1

and an environment:

val x: int = 2;

which allows an actual evaluation. We use lambda abstraction to *move* the environment
into the code:

(fun (x:int) => x + 1) (2)

and beta-reduction to evaluate it:

2 + 1

and then builtin operations to actually do the calculation:

3

Apart from the primitives, we will use two combinators:

(x,y,1,4) // tuple value
int * int * int * int // tuple type
int * int -> int // function type
fun (x:int, y:int) : int => x + y // function constructor

The idea is all these forms are LEVEL INDEPENDENT but the primitives are LEVEL DEPENDENT.
In other words:

(int, int)

is a valid term, it’s a type tuple. There’s an issue with

()

What level is it? Let’s leave that for a moment. The rules are that * and -> must involve
terms at the same level. Now, our function has abstracted values but there is a requirement:
the parameters of the abstraction must be typed. And our function is, well, BORING.

It provides reusability .. and please note this is the HOLY GRAIL. It is the entire point
of PL design. It means increased chance of correctness, and increased optimisation
opportunities and most importantly, the programmer has more time to make coffee.

Let's make our function MORE REUSABLE and LESS BORING:

fun (T:TYPE, U:TYPE) (x: T, y:U): U * T => y,x

Now, our function is parametrically polymorphic. It works for any types.
To do this we added another lambda binder, one level up: a binder which
generalises the types.

But what is that “TYPE” thing?? Well that’s a meta type, or kind. The rule is,

ALL VARIABLES AT LEVEL N MUST BE TYPED AT LEVEL N+1

Now note, the notations here are level independent so we can also write
type functions, not just ordinary functions:

fun (T:TYPE, U:TYPE):TYPE => U * T

Note this has the kind

TYPE * TYPE -> TYPE

Our notation is going to get unreadable unless factored so I’m going to allow this:

let name:expr<n+1> = expr<n>

and also allow the classifier on the LHS to be dropped, for example:

define tswap: TYPE * TYPE => TYPE = fun (T:TYPE, U:TYPE): TYPE => U * T

so we can more conveniently write

tswap (int, string)


and perhaps

let .. in ..

This is just sugar (for the moment). We’re putting back the environment we abstracted away
with the lambdas and reified with beta reduction.

Now I ran into this one in Felix. I want to define composition of any (suitable) data functors:

define compose =
fun (J:KIND,K:KIND,L:KIND)
(f: K->J, g :L -> K):L -> J
=> fun (T:L) (t:T):T => f(g(t))



This is saying if you have a data functor g from L to K, and a following functor
f from K to J, you can compose them to get a functor from L to J which we
can define by substitution effected by beta reduction using eta-expansion.

Notice two things. First, KIND is a sort, and what we’re saying is J,K,L can be
ANY category, that is, any type level expression such as

TYPE * TYPE

This is the whole point of the kind variables J,K,L, that we allow say bifunctors like `tswap`
to be composed. Here is something we could compose with:

fun diag (T:TYPE):TYPE * TYPE => T,T

the diagonal functor (aka “copy” or “dup”).

Notice also, in the implementation, the eta-expanded variable t, must not only have
a type T, but, since we introduce type variable T, we have to give it a kind too,
namely L. Of course, we can use compose like this:

val x : compose (TYPE, TYPE * TYPE, TYPE) (tswap, diag) (int) = 1;


I think I got that right :)

Notice, the pair (tswap, diag) is a pair of *functors*.

There is no reason to stop at SORT level, although I can't think of a use for meta-sorts
at the moment. However, one of the critical contentions here is that you can write
general formulas which are level independent. So they apply at whatever level you like.
We can certainly eliminate lower levels, for example, tswap is a pure type function, it
does not mention any concrete types or values. So at the moment I think you can
define a function from any higher level down to any lower level.

The level of a term or sub-term has to be defined by the primitives involved,
for example

1, int, TYPE, KIND

There are issues with levelless terms like ().

In general some annotations can be defaulted away, and some can be deduced. For example


val x: compose (tswap, diag) (int) = 1

should be enough, because the kinds of tswap and diag are recorded in the environment.
This makes overloading possibility.

SUBCATEGORIES
===============

Subtyping is an essential part of some languages. It is generally considered to
be some kind of “is a” notion. In set theoretic terms we want an embedding,
and category theoretic terms probably at least a monic operator.

Felix REQUIRES not only sub-typing but also sub-kinding. Here is the scenario:

In Felix, we have arrays notated:

T ^ N

As you know, an array is an exponential of the same general kind as a function:

N -> T

However they have different implementations. Functions are coded by composition
of other functions. Arrays are define element by element. In some sense they’re
intrinsically functions with a finite domain.

Now, usually we think of arrays as linear and N is just some integer constant,
but NOT in Felix. In Felix, the N there is a type:

int ^ 3

is identical to

int * int * int

in Felix, however 3 is identical to

1 + 1 + 1 = unit + unit + unit

which is a structurally typed standard enumeration with 3 cases. So in the first instance:

T: TYPE, N:UNITSUM

are the kinding constraints on array types, where UNITSUM is the kind with the types

0, 1, 1 + 1, 1 + 1 + 1 …..

in it. I write 2 = 1 + 1, 3 = 1 + 1 + 1, etc. Note

bool = 2

is a UNITSUM. So now we NEED a subkind of TYPE, to specify the constraint:

define array = (T:TYPE, N:UNITSUM) => T^ N


In fact, the actual constraint is

N:COMPACTLINEAR

which is more general. A compact linear type is 0, 1, or any compact sum or compact product
of compact linear types. This formulation allows for polyadic array access and array shapes.
The point is now we have


UNITSUM is a COMPACTLINEAR is a TYPE

So now, when we do deduction or "type checking", which means also kind checking and sort
checking etc etc .. our unification algorithm must also handle subcategories.

Unification is actually simple: we do a standard unification, and recursively, one level
up, also unify the kind annotations on type variables, etc ..

The thing to note is Felix has TWO product functors not one:

U * V // standard tuple, addressable components
U \* V // compact linear product

The compact linear product functor has kind

COMPACTLINEAR * COMPACTLINEAR -> COMPACTLINEAR

Of course this is just the bifunctor. The run time product formed here packs
the compact linear values into a single machine word, and projections
use division and modulus operations, instead of machine addresses.
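A minimal worked example of the assumed encoding: a value (i, j) of the compact
linear product 3 \* 2 is packed into one machine word as i * 2 + j, and the
projections are recovered arithmetically:

// assumed packing of (i, j) : 3 \* 2 into a single integer v
//   v = i * 2 + j            e.g. i = 2, j = 1 gives v = 5
// projections by arithmetic instead of addressing:
//   proj 0 v = v / 2         // = 2, the first component
//   proj 1 v = v % 2         // = 1, the second component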

And now, we get into really high level stuff, because we can certainly
“unpack” any compact linear product to an ordinary one. And vice
versa IF the argument type satisfies suitable constraints ..


BUT THERE IS NO POSSIBLE WAY TO ENCODE THESE
EXTREMELY HIGH LEVEL OPERATIONS WITHOUT AN EXTREMELY
HIGH LEVEL CATEGORICAL KINDING SYSTEM.


As a final word, Haskell is clearly a LOW LEVEL LANGUAGE.
It has a rudimentary and incapable kinding system. It is NOT capable
of expressing high level operations such as those described above.

I contend Haskell can be equipped with BOTH linear and uniqueness
typing, for example, but you can NOT do this without a high level
programmable kinding system. The rules for substructural logics
cannot be applied “generically” if one doesn’t know the kind of
a type variable.



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Dec 29, 2021, 7:59:06 AM12/29/21
to felix google
So .. attempts to actually implement this were a disaster.

The problem is a lot of stuff assumes symbols can be qualified by type variables.
But now we also have kind variables.

At present, there is a huge mess with type aliases. The unbound term is just

SYMDEF_type_alias of type

which is in some symbol table, eg a class or function, with a name and type variables.
To make matters messy there are TWO entities here:

1. A mapping from an integer index to the symbol definition
2. Possibly two mappings from names to integers
1. Private
2. Public

The definition is:

(** The symbol type. *)
type t = {
id:Flx_id.t;
sr:Flx_srcref.t;
vs:Flx_types.ivs_list_t;
pubmap:Flx_name_map.name_map_t;
privmap:Flx_name_map.name_map_t;
dirs:Flx_types.sdir_t list;
symdef:Flx_types.symbol_definition_t;
}


The names inside a function, eg parameters, local variables, etc are only
in the private map. Inside a class, unless marked private, names are
in both the public and private maps.

These maps are not plain name -> int. First, there are two kinds:

1. NonFunctionEntry
2. FunctionEntry

A non-function entry is for non-functions. Duh. They must be unique. You cannot
have two variables in the same scope with the same name.

type entry_set_t =
| FunctionEntry of entry_kind_t list
| NonFunctionEntry of entry_kind_t

type name_map_t = (string, entry_set_t) Hashtbl.t

The function entry maps the name to MULTIPLE functions.

Now here's the really tricky bit:

type entry_kind_t = {
(* the function *)
base_sym: bid_t;

(* the type variables of the specialisation *)
spec_vs: Flx_kind.bvs_t;

(* types to replace the old type variables expressed in terms of the new
* ones *)
sub_ts: Flx_btype.t list
}

So what we have here is that a name maps, not directly to a symbol, but
WITH A SPECIALISATION. For example

class X[T] { fun f: T -> T; }

might be in the symbol table but you might say

open[U] X[U * U];

and now f is available with type

f: T * T -> T * T

So the current scope has an f mapped to entry kind:

base_sym = 9942, spec_vs = U, sub_ts = U * U

where 9942 is the index of the definition of f in class X in the symbol table.

The PROBLEM is that whilst, for non-parametric symbols, we just have an empty
specialisation .. for type functions, we have KIND variables not type variables.

Now, this is just the beginning. In theory, typedefs just define aliases.
So these should be “factored out” by the time we’re binding code.

Alas, no. A HUGE mess is created, because there an alias just adds
name to a structural type, so the name just gets replaced by it.

Except for recursion. Typedefs can be recursive. The problem is
that with type schema like

typedef list[T] = 1 + T * list[T]

the recursion is easy, but what if we specialise T?
I mean we can easily have

vector[list[T]]

and now the recursion is inside the type indexes of the type vector.

So it ends up I have to temporarily make a bound symbol table entry for
nominal type aliases, in order to allow for recursion, because the recursion
requires lookup, if the recursion is not self recursion:

typedef A = … B
typedef B = .. A

Recursions can be resolved, but only self-recursion can be resolved in
one step. It’s a mess.

NOW ADD KIND VARIABLES TO THE MIX.

In principle its simple.

In practice Felix lookup is indeterminate. In other words, some lookups
simply cannot work. Things like

typeof (expr)

seem innocuous. But they’re extremely hard to deal with because, to find
the type of expr, we first have to BIND the expression, which involves overloading
AND it may recursively depend on itself!

var x : typeof(x) = x;

Hmmm…. :-)



--
John Skaller
ska...@internode.on.net





John Skaller2

unread,
Jan 1, 2022, 11:00:46 PM1/1/22
to felix-l...@googlegroups.com
So I am thinking to rebuild Felix from scratch (writing code in Felix).
The main problem is the parser. Dypgen will be almost impossible to implement.

The basic idea is to use pure categories. So for example

type int : TYPE;

declares "int" to be a primitive type which is an object of category TYPE. An arrow would
be a type function:

fun swap = (x:TYPE, y:TYPE): TYPE => y * x;

For sums types:

fun first:TYPE + TYPE -> TYPE =
| case 0 of 2 (U) => U
| case 1 of 2 (V) => V
;

This is the general form. To make this work, we need builtin operators:

product (..,..,..) // infix *
coproduct (..,..,..) // infix +
exponential (A, B) // A -> B
tuple (a,b,c)

So the first two operators take 0 or more arguments. The notations are

1 = product()
0 = coproduct()
product (A,B,..) = A * B * …
coproduct (A,B) = A + B + …


These operations are generic. A uniform product is a functor

TYPE ^ N -> TYPE

where TYPE ^ N is the product of N copies of TYPE = TYPE * TYPE * TYPE …

The N can be deduced from the number of arguments if the functor is applied.
The categories can be deduced from the arguments if it is not a uniform functor.

Obviously, if a functor has a product argument, it can be applied to a corresponding
tuple. For example:

fun swap(A:TYPE, B:TYPE):TYPE => B * A;
val x : swap(int, double) = 4.2, 1; // double * int

Composition and application of functors are directly related (by the Yoneda lemma I think):

(f \circ g) x = f (g x);

In particular

f x = (f \circ 1_x) ()

where 1_x is the functor 1->TYPE which picks out the value x, called the
characteristic function of x. So any application seqence can be converted
to a composition sequence applied to the unit value, and, any composition
can be turned into an application by eta expansion.

This means evaluation can be performed starting at the end of a composition
chain applied to a () and applying it, then applying the next on the left to the result,
until we end up with a value. However optimisations work best on compositions
because composition is associative .. and it's the "native" operation of category
theory.

As well as types, we have kinds. The main kind has sort KIND.

We also need to have subcategories (i.e. subtyping). As an example in Felix:

UNITSUM < COMPACTLINEAR < TYPE

These are embeddings, that is, the subtyping monomorphism is from
a category X which is a subcategory of Y, and maps each object and arrow to
itself. In CT, we would be happy with any monic, that is, an isomorphism
from a category P to a subcategory of the target. However this requires specifying
the isomorphism (an embedding of a subcategory in the category it is sub of does not).

When (in particular) an arrow (fun) is specified, there may be “type variables”.
This adds an extra level of lambda abstraction. For example

fun (K1: SORT, K2:SORT): KIND =>
fun (T1: K1, T2:K2) : TYPE => T1 * T2
;

Here two different kinds (which are values of meta-sort SORT), specify
the kinds of the two type variables T1 and T2. The motivating
example is:


define compose =
fun (J:KIND,K:KIND,L:KIND)
(f: K->J, g :L -> K):L -> J
=> fun (T:L) (t:T):T => f(g(t))


What I wanted was a rule to compose any two data structures, and that’s it.

For example a map of lists of pairs has three type parameters:

compose (TYPE * TYPE, TYPE * TYPE) (map, list, pair) (int, string, double)

is a map keyed by an int, of lists of pairs of string and double. Here, the
kind arguments can be deduced from the specified functors map, list, pair,
and indeed type checked. Er .. well kind checked. The resulting
functor has three arguments, and is applied to create an actual type.




John Skaller
ska...@internode.on.net





John Skaller2

unread,
Jan 2, 2022, 7:03:16 AM1/2/22
to felix google
One of the things I find confusing is that if you write say

type vector[T] = "vector<?1>";

then vector is NOT a functor, it’s an indexed type. Note it’s a nominal type.
Another case:

variant opt[T] = | Some of T | None;

Again, opt is NOT a functor, it's an indexed type. Polymorphic nominal types
are always indexed types.

An indexed type like “opt” represents a collection of types, that is,
it's a type scheme. It has no standalone validity; it always has to
be "applied" to some indices. However you can always define a functor:

typedef fun vector_t(T:TYPE):TYPE => vector[T];

You cannot compose indexed types; however, you can *substitute*:

vector[vector[int]]

which is equivalent to composition (composition and application BOTH
work by substitution).


Now, in the case of C++ types, we cannot monomorphise them in the front end.
This has to be done in the back end code generator. So, in this special case,
we must carry the monomorphised indices along to the back end with the schema.

OTOH, type functions have to be eliminated very early, before overloading in fact.
Whereas, type schema *participate* in overloading. For example

fun f[T] (a: vector[T]) => ..
var x : vector[int];
f x; // T is deduced to be “int”

In some sense, schema are “generative” in that they generate types, whereas
functors are applicative, in the sense they *calculate* types (that already exist).

Still I cannot properly explain this stuff. In particular I don’t have a cool category
theoretic explanation for the distinction.


John Skaller
ska...@internode.on.net





Keean Schupke

unread,
Jan 2, 2022, 8:12:39 AM1/2/22
to felix google
Hi John,


Looking at LambdaPi might help if you haven't already done so. I have an implementation in TypeScript here that runs using Deno:

https://github.com/keean/lambdapi

There are references to the original implementation paper in the code.

LambdaPi is a nice simple dependent type system, which I believe corresponds to a locally Cartesian closed category.

This paper has some more details:

https://mroman42.github.io/ctlc/ctlc.pdf


Cheers,
Keean.



John Skaller2

unread,
Jan 2, 2022, 10:59:54 AM1/2/22
to felix google


> On 3 Jan 2022, at 00:12, Keean Schupke <ke...@fry-it.com> wrote:
>
> Hi John,
>
>
> Looking at LambdaPi might help if you haven't already done so. I have an implementation in TypeScript here that runs using Deno.
>
> https://github.com/keean/lambdapi

But no examples..

>
> There are references to the original implementation paper in the code.
>
> LambdaPi is a nice simple dependent type system, which I believe corresponds to a locally Cartesian closed category.
>
> This paper has some more details:
>
> https://mroman42.github.io/ctlc/ctlc.pdf

Good paper. But as usual CT notation bends the brain :)

I'm trying to find a general syntax for functions. The actual evaluation is trivial: it's just
substitution, possibly with some variables fixed using unification.

The idea is simply that whenever you have a variable you have to give it a type:

fun (x:int) => …

But, we want polymorphism so:

fun (T:K) => fun (x:T) => e

You should read "fun" as "lambda". The type variable T, which types x, must itself be typed,
here with kind K. And if K itself is polymorphic, we have to announce the sort of any kind variables therein.

It’s all head-normal form here. For deduction the type of e can be used to deduce T which can be used to
deduce K, maybe. The general form can start at any level, like meta-meta-sort, and proceed to any lower level.

In the simplest system, the encoding can use the same unification engine at all levels, and the same
constructors (product, tuples, etc), without level annotations, provided the level is homogeneous and
populated. It's not clear what level (), the empty tuple, is, or even if it matters.

Initially, I’d have no deduction. There would be some way to specify opaque entities
like the sort KIND, or the kind TYPE, or the type int. Plus subtyping rules.


Making this work in Felix is going to be very hard. It goes easily to kinded type variables
where the kinds are expressions of constants, but getting kind variables in there is
very tricky.

There’s no dependent typing involved. In general the base level of that, involving
only simple integer relations, is easy enough (see ATS2). But once you go past that,
you have to write difficult proofs as part of your program.

I have a half-belief that instead of levels, you can just do the “typing” with ANY category.
For example the category NAT, one object, an arrow for each n meaning to add n, so a
path is just a summation, can be a run time integer, a compile time integer, a type,
a kind, whatever. If you say

T: NAT

where T is a type, then NAT is a kind, but if you say

V: NAT

where V is a value, then NAT is just “unsigned int”. That’s more general than
a striated (levelled) system. Encoding it will be tricky. I’d probably have to do the
simpler system first, which will get more general by removing the repetition.

Anyhow at the moment, I could probably implement the basic calculus easily
in Ocaml or Felix, but integrating it INTO Felix is another story. The compiler
was designed to be polymorphic from the ground up (unlike other languages),
but that just means type variables not kind variables. The problem is you have
a heterogeneous list of the variables at the head, with the most general, then going downhill
until you reach plain old typed function parameters (or stop earlier).


John Skaller
ska...@internode.on.net





Keean Schupke

unread,
Jan 2, 2022, 3:02:10 PM1/2/22
to felix google
Hi John,


Not sure if I am missing your point. The lambda calculus vending machine shows how simple it is to extend LambdaPi to embed any Pure Type System:

https://crypto.stanford.edu/~blynn/lambda/pts.html
As for general function notation, giving every argument a type (kind etc), _is_ sufficient, but many dependently typed languages prefer type annotation to allow types (kinds etc) to be omitted.

So, simply: value arguments have types, type arguments have kinds, and kind arguments have whatever comes after kinds :-)

The restriction is then all 'sort' names are unique (type names cannot be used for kinds etc).

There is a notation where the 'kind' of 'type' is '*' (star) and the kind of kind is '[]' (box) also written '?' for ASCII keyboards.

We don't need level annotations, and we can be polymorphic over levels when using 'type/kind' variables, although LambdaPi itself is not and we have to write:

id = \(T:*)(x:T) -> x

So the identity on integers is:

(id Int)


Keean.



John Skaller2

unread,
Jan 2, 2022, 7:31:28 PM1/2/22
to felix google


> On 3 Jan 2022, at 07:01, Keean Schupke <ke...@fry-it.com> wrote:
>
> Hi John,
>
>
> Not sure if I am missing your point. The lambda calculus vending machine shows how simple it is to extend LambdaPi to embed any Pure Type System:

I'm sure. The problem is the Felix compiler is written in Ocaml and is statically typed.
The compiler makes provision for type variables. There's a combinator:

type type_t = ….
BTYP_type_variable of index * kind_t

So I need this:

type kind_t =
KIND_kind_variable of index * sort_t

if I want the kind_t term to contain kind variables. This is easy enough. The problem is
that the compiler takes parsed terms and decouples them into

name, typevariables, scopedictionary, symboltableentry,

or something like it. For a function the scope dictionary is the local symbol table
of the function. The PROBLEM is that the typevariables are type variables. ONLY.
I cannot put kind variables in that slot, because they have the wrong static type.




>
> https://crypto.stanford.edu/~blynn/lambda/pts.html
>
> As for general function notation, giving every argument a type (kind etc), _is_ sufficient, but many dependently typed languages prefer type annotation to allow types (kinds etc) to be omitted.

Sure, but the syntax, and indeed, the ability to deduce types, will come after solving the main problem,
which is to actually get kinds, sorts, etc .. into the compiler so the information propagates.

The compiler actually does this:

1. Parse to Scheme
2. Evaluate Scheme
3. Translate resulting S-expressions into raw Ocaml terms
4. Desugar the raw terms into a more systematic presentation
which separates declarations from definitions
5. Build the global symbol table and change the terms so
local name spaces (eg function locals) are presented as maps
to the global symbol table
6. Binding. Make a new symbol table in which names are replaced by
discrete keys (actually just integers). This is an extremely complex phase
because it contains the stuff that does overloading. Felix has “setwise” lookup
meaning all symbols are defined recursively within a scope
7. Monomorphisation. Remove all type variables. Instantiate type classes.
8. Optimisation….blah blah backend C++ generation.



The problem is that there are 8 DISTINCT calculi there for the same terms depending on
the compiler phase.

>
> So simply value arguments have types, type arguments have kinds, and kind arguments have whatever comes after kinds :-)

Sorts come after kinds :-)

>
> The restriction is then all 'sort' names are unique (type names cannot be used for kinds etc).
>
> There is a notation where the 'kind' of 'type' is '*' (star) and the kind of kind is '[]' (box) also written '?' for ASCII keyboards.

Haskell uses * but it is totally stupid. In Felix, these kinds already exist:

UNITSUM < COMPACTLINEAR < TYPE

A unitsum is just a simple enumeration 1 + 1 + 1 + 1 .. etc.
A compact linear type is 0, 1, or any compact product or sum of compact linear types.
These are a subkind of TYPE. The representation is an integer. Projections
are done by division and modulus.

A compact linear type term can contain a type variable of kind COMPACTLINEAR
which means the type variable can only be instantiated with a compact linear type.
Otherwise the calculus would not work.

This is also true for pointers. Compact linear products have pointers to components,
which are just a machine pointer PLUS a divisor and modulus. So three machine words,
rather than one.

So the kinding information is really essential.
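A sketch of the assumed representation behind that claim:

// a pointer into a compact linear product is assumed to carry three machine words:
//   p        : pointer to the word holding the packed value
//   divisor  : strides past the components to the right of the target
//   modulus  : the number of values of the target component
// so a read through it is   (*p / divisor) % modulus
// and a write replaces that digit of the packed word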


>
> We don't need level annotations, and we can be polymorphic over levels when using 'type/kind' variables,

Yes, that’s what I suspected. The primitives are at a fixed level. For example a string is a run time
string. Heck, in Felix it is defined by

type string = "::std::string";

and is not available to the compiler (because the compiler is written in Ocaml).


> although LambdaPi itself is not and we have to write:
>
> id = \(T:*)(x:T) -> x
>
> So the identity on integers is:
>
> (id Int)

Right. The notation is only slightly different to Felix.

I will note .. there is a COMPLICATION in Felix, previously mentioned:
for nominal types, including primitives, these are NOT functors but indexed types.
You can overload on indexed types, and deduce the indices (using unification)
You cannot overfload on functors. They have to be beta-reduced away before
overloading can proceed.

I am tempted to rewrite the compiler from scratch to get the kinding system correct,
but it’s a BIG job so I’m looking for a way to retro-fit kind variables.



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Jan 2, 2022, 8:11:44 PM1/2/22
to felix google
In particular the problem starts here in the grammar:

stypeparameter := sname ":" t[sarrow_pri] =># "`(,_1 ,_3)";
stypeparameter := sname =># "`(,_1 typ_none)";
stypeparameter_comma_list := sepsilon =># "()";
stypeparameter_comma_list := stypeparameter ("," stypeparameter)* =># "(cons _1 (map second _2))";

stypefun_arg := sname =># "`((,_1 typ_none))";
stypefun_arg := "(" stypeparameter_comma_list ")" =># "_2";
stypefun_args := stypefun_arg+ =># "_1";

//$ Anonymous type function (lamda).
t[slambda_pri] := "fun" stypefun_args ":" stypeexpr "=>" stype =>#
"""
`(typ_typefun ,_sr ,_2 ,_4 ,_6)
""";

All the terms t[priority] are non-terminals of the type sublanguage. So in the parser, we could use
the same terms exactly as kinds. There’s no kind sublanguage. So in principle, this grammar
already handles everything we need, provided we can interpret it with the right “level” of
terms when the S-expressions are translated to Ocaml.

An example:

fun (T:TYPE) (U:COMPACTLINEAR) =>
fun (K:TYPE) => T * U * K
;

However, this will not descend to actual functions, since the grammar
only allows a type expression after the => symbol.

At present, kind names like COMPACTLINEAR are built into the compiler and
cannot be user defined; there is no lookup table. Types, however, can be named.

So there is a clear issue, even with the grammar, it doesn’t descend to “run time”
functions. Like this one:

fun swap[T:TYPE,U:TYPE] (x:T, y:U):U * T => y,x;

This works, but the type arguments (and they cannot be kind arguments), are indices
to a type schema, so swap is not a “functor” or whatever. We would like:

swap = fun (T:TYPE, U:TYPE) (x:T, y:U) => y,x;

and note, this is an actual polymorphic function headed by a functor. The problem
is, it will not work with overloading. This works:

fun doswap[T:TYPE] (swap: T * T -> T * T, x:T, y:T) => swap (x,y);
doswap (swap, 1,2);

It deduces the type T = int. But if you use a lambda abstraction, it cannot work.
What makes it work is that swap is *indexed*. Type schemas are required for
*nominal* polymorphic types. You cannot beta-reduce into a nominal type.
Structural types use lambda abstraction because you CAN beta-reduce the
abstractions away .. which you must do *before* overload resolution.

Example of nominal type:

variant opt[T:TYPE] = Some of T | None;





John Skaller
ska...@internode.on.net





Keean Schupke

unread,
Jan 3, 2022, 3:53:00 AM1/3/22
to felix google
Hi John,


The thing with LambdaPi, and why it's interesting to look at the implementation, is that it internally collapses everything to one level, the value level. Type checking is simply evaluation followed by type equality.

Because you can use Higher Order Abstract Syntax (even in TypeScript where the type system is static, and definitely in Ocaml), type equality is simply quoting the value you get from evaluation (beta reduction) and comparing it to the other type. Strong normalisation means syntactic comparison is enough.

The value machinery is re-used for types, and there is no difference at all between levels, at the next level types are treated as values, kinds as types, etc.
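
Roughly, the idea in code (a toy de Bruijn sketch of mine, NOT LambdaPi's actual HOAS implementation): type checking normalises both sides by beta-reduction and then compares them syntactically.

data Term = Var Int | Lam Term | App Term Term
  deriving (Eq, Show)

-- shift free variables >= c by d
shift :: Int -> Int -> Term -> Term
shift d c (Var k)   = if k < c then Var k else Var (k + d)
shift d c (Lam b)   = Lam (shift d (c + 1) b)
shift d c (App f a) = App (shift d c f) (shift d c a)

-- substitute s for variable j
subst :: Int -> Term -> Term -> Term
subst j s (Var k)   = if k == j then s else Var k
subst j s (Lam b)   = Lam (subst (j + 1) (shift 1 0 s) b)
subst j s (App f a) = App (subst j s f) (subst j s a)

-- full beta-reduction to normal form (terminates on strongly normalising terms)
normalise :: Term -> Term
normalise (App f a) =
  case normalise f of
    Lam b -> normalise (shift (-1) 0 (subst 0 (shift 1 0 (normalise a)) b))
    f'    -> App f' (normalise a)
normalise (Lam b) = Lam (normalise b)
normalise t       = t

-- type equality is just: evaluate, then compare syntactically
typeEqual :: Term -> Term -> Bool
typeEqual a b = normalise a == normalise b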

I am not sure 'sorts' are the next level. The term 'sort' comes from logic (many sorted logics), and I believe is simply the logic equivalent of 'kind' in a type system. In type systems we have values, types and kinds, but this is probably better thought of in terms of universes, where the type levels are T[n]: U0 has values : T0, U1 has T0 : T1, etc.

I appreciate you have more kinds than '*', remember '*' is the kind of functions _only_. You can quite easily introduce more kinds (of type '[]').

I am not sure why you cannot overload on functors? All evaluation is beta reduction, yes you may optimise evaluation to be machine code instructions, but you are just emulating beta-reduction. 

So function evaluation is beta-reduction, and you can overload on function evaluation, so why can't you overload on functors? Overloading on functors would be 'kind' overloading, not 'type' overloading obviously.


Cheers,
Keean.





John Skaller2

unread,
Jan 3, 2022, 9:14:18 AM1/3/22
to felix google
I want to implement what you have described, I think, however my calculus is a bit richer.
Lambda only has functions A -> B. I have products and sums too (and later will add arrays).
And subtyping. But the model you present appears to be what I want. I’m just not sure how to
get it into Felix. Maybe I just make kinds and sorts the same universe as types.

Actually my thought is when you write

x : T

it actually means x is an object of *category* T. Since categories are abstract, there are no intrinsic levels.
I mean, a category is just structure. Indeed, types and kinds both allow products and sums and exponentials.

The real problem is that you have to get at the value level and generate code and execute it rather than
interpret it. But this can be done by a partial evaluator (basically, instead of interpreting something expensive,
generate and run code, JIT style as an optimisation).

====

I think there’s a misunderstanding about overloading and functors. If you name a functor,
you can have many with the same name, and choose one by overload resolution based
on the argument it is applied to. Obviously you can only do this when you have an application.
In Felix, you can choose WITHOUT an application as well, by specifying the type the argument
would have. It’s at the kind level not the type level for functors, but the tech is the same: we use
a specialisation engine (unification with subtyping and subsumption).

The issue I was referring to is overloading an ordinary function, when the argument is
a functor application. The problem is, a functor is just a lambda term, so you cannot
“see inside”. All you can see is the functor argument kind, usually type. So you’re trying
to overload on

(\lambda x:T => we can’t see in here) (argument)

All lambda applications look the same, you can’t overload on the term above.

Now, if you *beta-reduce* the application, then you can overload on the resulting type.
But that’s useless! Consider a function overloading on vector<T>, we need the T.
But the T is gone after beta-reduction.

By comparison, consider vector as a type scheme, so it is a nominal type,
then we can pick

fun f[T] (x:vector[T])

and find T, given an argument. The reason is vector is not an alias for a type
expression, it’s a nominal type, and the T can’t be eliminated by beta reduction.
The point is, vector is not abstract here. In Felix:

typedef fun vector_f(T:TYPE):TYPE => vector[T];

creates a functor vector_f from the type scheme, but it is just an alias.
You can’t overload on that. I mean

fun f[T] (x: vector_f T) ….
f (vector_f int)

You cannot “match” the vector_f in the function definition with the vector_f
in the function application, because they’re just aliases for a lambda abstraction.

You can match on what is left after beta-reduction, namely, vector[int] and find
that T = int. Because the “vector” part and the T part are separated.
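
A rough analogue in Haskell (just an illustration, the names are made up): a nominal constructor keeps its identity and its argument can be recovered, whereas an alias is expanded away and leaves nothing to match on.

-- nominal: the "Vector" constructor survives in the type
data Vector a = MkVector [a]

-- alias: just rewrites to Vector a and then disappears
type VectorF a = Vector a

-- the variable a is recovered from a nominal application, because
-- "Vector" is still visible in the argument type
len :: Vector a -> Int
len (MkVector xs) = length xs

-- a nominal constructor can also be passed around unapplied;
-- the alias cannot, it only exists fully applied
data Pair f b = MkPair (f b) (f b)

ok :: Pair Vector Int
ok = MkPair (MkVector [1]) (MkVector [2])

-- bad :: Pair VectorF Int   -- rejected: VectorF is an unsaturated synonym,
--                              there is nothing of its own to match on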

It’s really hard to explain. When I worked for Barry Jay, we had an example
like

list of list of int

which is a type. The functor “list” is lost. You cannot fold over this thing
because you cannot fold over a type, you fold over a functor. The question
is *which functor*?? List? Or list of list?

The point is

fold x

cannot work, because the functor has been applied to the type, and is now
lost. You have to do the fold *before* you reduce. There are two cases:

(list . list) (int)
list (list int)

where . means composition. The functor in the first case is list of list,
in the second it is just a list.

The question Barry asked was, “where is the data”? It actually depends
on where you put the parens. There are a pair of isomorphisms which
MOVE the parens, they’re called associators.
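
In Haskell terms (an illustration only, using the standard Compose wrapper), the two bracketings really are different functors over the same data:

import Data.Functor.Compose (Compose(..))

nested :: [[Int]]
nested = [[1, 2], [3]]

-- the same data viewed through the composed functor (list . list)
composed :: Compose [] [] Int
composed = Compose nested

-- folding over the outer "list" functor: the elements are the inner lists
sums :: [Int]
sums = map sum nested        -- [3, 3]

-- folding over "list . list": the elements are the Ints themselves
total :: Int
total = sum composed         -- 6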

Sorry, I’m rambling.





John Skaller
ska...@internode.on.net





John Skaller2

unread,
Jan 6, 2022, 8:42:03 PM1/6/22
to felix-l...@googlegroups.com
So I’m having yet another go at this. The new syntax for type functions is

typefun f[K:SORT] (T:K):K => T

Anonymous type functions cannot have kind variables, just as anonymous
functions cannot have type variables.

This is because these variables are associated with a symbol in the symbol table.
It took a while to figure that out.
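
For comparison, GHC’s PolyKinds gives a rough analogue of a type function indexed by a kind variable (an illustration only, not Felix):

{-# LANGUAGE PolyKinds, TypeFamilies #-}

-- a kind-polymorphic identity type function: k plays the role of the kind variable K
type family Id (t :: k) :: k where
  Id t = t

-- Id Int   reduces to Int     (k instantiated to Type)
-- Id Maybe reduces to Maybe   (k instantiated to Type -> Type)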

Now I know in advance the problem I will have. Ordinary variables have types.
The types in the unbound symbol table are unbound terms. However, type variables
have to already have

1. indexes
2. fully bound kinds


This was possible before, because all the kinds were compiler known
constants or combinations of them. It is no longer the case because we
have kind variables.

So in order for this to work, we must have previously indexed the kind variables
and fully bound their sort. And they must be in the symbol table ALREADY before
type variables are added, otherwise lookup when binding the kinds of type variables
cannot work.

An extra pass is needed for this. We now have lookup in kinds, which we didn’t have before.
It’s only for kind variables (there are no user defined kinds as yet).



John Skaller
ska...@internode.on.net





Keean Schupke

unread,
Jan 7, 2022, 2:40:43 AM1/7/22
to felix google
Hi John,


This makes sense to me, and would be the same for user defined kinds, all the way up (or down depending on how you look at it). In the same way I assume you could have scoped type variables defined in the same way as value variables defined with 'let' or whatever.

I assume you need forward references; if you don't, then you won't actually need a separate compiler pass. For example, in LambdaPi forward references are not allowed, hence the symbol table is simply passed as an argument to the type checking function, which does a recursive descent of the abstract syntax tree, and that also deals with scoping the definitions. There is a single symbol table for all definitions.


Cheers,
Keean.






John Skaller2

unread,
Jan 7, 2022, 8:41:55 AM1/7/22
to felix google


> On 7 Jan 2022, at 18:40, Keean Schupke <ke...@fry-it.com> wrote:
>
> Hi John

Hi. It’s half working now.

>
> This makes sense to me, and would be the same for user defined kinds, all the way up (or down depending on how you look at it). In the same way I assume you could have scoped type variables defined in the same way as value variables defined with 'let' or whatever.

Yes, but there are no user defined kinds yet.

The current code will (eventually) look like:

typefun thing[K1:KIND, K2:KIND] (T1:K1, T2:K2 * K2): K1 => … ;

Here KIND is a sort (and the only one at the moment). So the kind parameters are only used to specify the
kind of the type parameters. The ( .. ) => .. part is a standard type lambda form.

You can have an anonymous type lambda but it can’t have kind variables in the signature.
Just as you can have a functional lambda, but it can’t have type variables in it.

At the moment. The reason is, an arbitrary chain of lambdas couldn’t be parsed, unless Felix
had the “self-kinding” kind stuff, that is, any type would be a kind in the right context.
So the present model is stratified. It will do for the moment.

There’s more to be done, but the hard job was separating type aliases (which can have type indices)
from type functions (which have kind indices). Indices go in the [] brackets. Ordinary functions can be
polymorphic too and have type indices, as do polymorphic data types.

> I assume you need forwards references, if you don't then you won't actually need a separate compiler pass. For example in LambdaPi forwards references are not allowed, hence the symbol table is simply passed as an argument to the type checking function which does a recursive descent of the abstract symbol tree, and that also deals with scoping the definitions too. There is a single symbol table for all definitions.

Felix uses set-wise lookup. There are no declarations, only definitions.
So you can refer to a symbol defined anywhere in the current scope or any enclosing scope:

fun f() => g();
fun g() => 1;

So the RHS of f() calls g, which isn’t defined “yet”. This is done in one pass with recursion.
It’s VERY hard because Felix has overloading. But NOT for type functions yet!

The problem with overloading is that to bind a function you need to know which one
is being called, so you need to know the type of the argument, and you need to know
the types of the candidate functions. Worse, the user doesn’t have to specify the return
type of a function, or the type of a variable.

So the algorithm does temporary partial binding on the fly. The partial binding
to find a function return type tries to calculate the type of the argument
of every return statement (but ignores the rest of the function code).

Of course, since the expression involves function applications .. it has to do
overload resolution again .. recursively.

The real KILLER is this type:

typeof(expr)

It’s a real super bitch. Here we have inverse dependent typing. A type is dependent
on the type of an expression, the evaluation of which in turn requires overload
resolution .. ARRGGHHHH……

Anyhow, recall I want to define a “generic” compose type function.
This was always possible for type functions like

A -> B followed by B -> C

and you could do it for

A -> B1 * B2 followed by B1 * B2 -> C

but you couldn’t use the same name for these because there is no overloading
for type functions.

That’s actually WHY the kind variables were needed. Note you can do it
already universally for ordinary functions because you have type variables.
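
A hedged Haskell sketch of the same point (an illustration, not Felix): with a kind variable, one type-level compose covers every shape, where without it you would need one definition per kind.

{-# LANGUAGE PolyKinds, TypeFamilies #-}

-- one type-level composition for all kinds a, b, c: the kind variables do
-- the work that would otherwise need one overload per shape
type family ComposeF (f :: b -> c) (g :: a -> b) (x :: a) :: c where
  ComposeF f g x = f (g x)

-- e.g. with f and g both at kind Type -> Type:
type ListOfMaybe = ComposeF [] Maybe Int   -- reduces to [Maybe Int]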

I am really curious how KINDS interact with type classes.

KINDS are constraints on types. So are type classes.
Are these interchangeable? Could I have a kind ADDABLE which has
the same impact as a type class Addable?

In Felix at least the machinery is utterly different. KIND constraints,
like types, are checked immediately. On the other hand, type class constraints
are only checked at monomorphisation (well actually they’re not so much checked
as the compiler can or can’t find an instance).




John Skaller
ska...@internode.on.net





Keean Schupke

unread,
Jan 7, 2022, 9:33:09 AM1/7/22
to felix google
Hi John,


My understanding of the difference between type-classes and kinds is that they are similar (overlapping) but different. Kinds like types are closed, in that we must define all the types in a kind in a single declaration, just like we must define all the values in a type in a single declaration. Logically we can use negation (it makes sense to ask if a value is not in a type, or a type not in a kind). Type classes are open. We can at any time declare a new instance of a type-class, so we cannot use negation.
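
A rough Haskell illustration of that contrast (made-up names): a promoted data type acts like a closed kind, while a type class stays open.

{-# LANGUAGE DataKinds, KindSignatures #-}

-- closed: promoting Colour gives a kind whose types are exactly
-- 'Red, 'Green and 'Blue, fixed once and for all
data Colour = Red | Green | Blue

data Swatch (c :: Colour) = Swatch String   -- c ranges over the closed kind

-- open: new Addable instances can appear in any module at any time,
-- so "is not Addable" can never be concluded
class Addable a where
  add :: a -> a -> a

instance Addable Int where
  add = (+)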

Further to this, type-classes control overloading, so that only the type-class instances are considered for overloading. Kinds have nothing to do with overloading in that sense; however, you can have functions that are kind-polymorphic, which is a separate feature from kinds themselves. You can have a kind system with no overloading at all.

So kinds + kind-polymorphism would be like closed type-classes I think. You gain logical soundness, but lose extensibility.

Cheers,
Keean.


John Skaller2

unread,
Jan 7, 2022, 3:26:44 PM1/7/22
to felix google


> On 8 Jan 2022, at 01:32, Keean Schupke <ke...@fry-it.com> wrote:
>
> Hi John,
>
>
> My understanding of the difference between type-classes and kinds is that they are similar (overlapping) but different. Kinds like types are closed, in that we must define all the types in a kind in a single declaration, just like we must define all the values in a type in a single declaration. Logically we can use negation (it makes sense to ask if a value is not in a type, or a type not in a kind). Type classes are open. We can at any time declare a new instance of a type-class, so we cannot use negation.

Good thought.

>
> Further to this type-classes control overloading, so that only the type-class instances are considered for overloading,

That’s not actually what happens in Felix at all. Instances aren’t considered for overloading at all.
All the overloading is done between *specialised* type class functions, completely independently
of instances. Only after a class and specialisation is selected does the system check if there’s
an unambiguously most specialised instance. So actually, there are two stages of overloading,
one on type classes, and then again on instances.

Stage one is done with ordinary overloading. Note, the class is chosen first, then the function
within it. Stage two occurs during monomorphisation.

> kinds have nothing to do with overloading in that sense, however you can have functions that are kind-polymorphic, which is a separate feature from kinds themselves. You can have a kind system with no overloading at all.

Actually I do use kinds for overloading. For example:

fun f[T:COMPACTLINEAR] (x:T) => …
fun f[T:TYPE] (x:T) => …

f(true) // calls the first one because bool = 2, which is compact linear
f(1) // calls the second one because int is not compact linear

COMPACTLINEAR is a subkind of TYPE. This is actually the *reason* Felix needs kinds.
The problem is that the representation of a pointer to a compact linear type requires
three machine words: one word for the machine address, and a divisor and modulus.
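
Roughly, in Haskell notation (a model of the idea only, assuming the usual mixed-radix packing, and ignoring the actual address word):

import Data.Word (Word64)

-- a "pointer" to one component of a compact linear value: the machine address
-- (elided here), plus the divisor and modulus that locate the component
data CLPtr = CLPtr { divisor :: Word64, modulus :: Word64 }

-- read the component out of the packed word
getCL :: CLPtr -> Word64 -> Word64
getCL (CLPtr d m) w = (w `div` d) `mod` m

-- write a new component value (assumed v < modulus) back into the word
setCL :: CLPtr -> Word64 -> Word64 -> Word64
setCL p@(CLPtr d _) w v = w - getCL p w * d + v * d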

I am also now doing some experiments with BORROWED and LINEAR kinds. I think I have not
succeeded in my experiments with uniqueness typing. I am using combinators:

uniq int
borrowed int

and subtyping rules:

uniq < shared < borrowed

but this is unsound because the conversion uniq to shared to borrowed is semantically different
from uniq to borrowed so it isn’t transitive. A uniq value can be shared, which moves the uniq value.
So the original variable is put in “dead” state. A shared variable can be borrowed, which copies it.
But if a uniq value is borrowed, it is copied, but remains in “live” state (and we trust the borrower
not to steal it).

All the presentations I’ve seen start with a substructural type system and mark shared types.
I’m starting with shared types and marking substructural ones.

The reason for the kinding here is that a complex data type may contain a unique one,
for example, and has kind UNIQUE, even though the actual type does not “start with the
uniq combinator”.

I’m thinking of discarding my effort and switching to the uniqueness and linear typing
being attributes of pointers. So only pointers can be considered references, which
are shared or unique or linear.



John Skaller
ska...@internode.on.net





John Skaller2

unread,
Jan 9, 2022, 5:41:53 PM1/9/22
to felix-l...@googlegroups.com
Weirdness. An anonymous type function is just a type. Of course it can only be
applied. However you can typedef one:

typedef f[T] = fun (T:TYPE):TYPE=>T;

which is a serious problem. What should happen is the RHS is alpha converted

fun (U:TYPE):TYPE=>U

so there’s no conflict with the LHS T. Normally the T would be deduced by unification
on application, but it’s not clear if this happens (I’ll have to check). So you’d have to specify it.

The problem of shadowing isn’t new, you can nest type lambdas.

With type functions there shouldn’t be a problem because the scheme parameters
are kinds not types. But that leads to the issue of how to detect a use:

f[k] (T)

The problem is whether k is a kind or a type: we need to know *in the parser*, before
type checking can see it. So it has to be syntax driven. If k is a kind, then T must be too!
If k is a type, T must be too. But there’s no way to tell. So we’re going to need this:

f[<k>] (T)

for kinds … grrrr. In which case this would be sane:

f[<k>][int] (T)

Note,

f[<k>](K)

means to replace kind variables occurring in K with the corresponding
kind in k.

HMMMM..






John Skaller
ska...@internode.on.net




