Felix currently supports uniqeness types using a combinator:
uniq T
and an coercion
box x: T -> uniq T
to construct one. When a boxed value is stored in a variable, the variable
is said to be live. No actions at all are allowed on a boxed variable,
other than to move the box out of it, leaving the variable in a dead state.
For example
var x = box 1;
var y = x;
At this point, variable x is dead, and variable y is live and contains the
box originally found in cx.
A box can be passed to a function, leaving the holding variable, if any,
dead:
var x = box 1;
var n = f x;
The function acquires the box, it now resides in the function’s
parameter, and the variable x is dead.
A variable must be dead when it goes out of scope otherwise
the box will be inaccessible and the resource its contents refer
to will leak. So this function is invalid:
fun f() { var x = box 1; return 1; }
because the box is lost, and so its contents are also.
In order to use the object refered to by a boxed value,
the the box has to be removed. This is done by the unbox operator.
var chp = boxed (malloc[char] 42);
fun clear (q: uniq &char, int n) {
var p = unbox q;
memset (p,char 0, n);
return box p;
}
chp = clear chp;
Here the function clear accepts the box, making chp dead,
then it unboxes it, clears the referent of the previously
wrapped up pointer, reboxes the pointer and returns it,
resulting in the variable being made live again due to the assignment.
Now we must be clear of the contracts involved. Unboxing a reference
is always safe because we OWN the reference uniquely. We can do
anything with it, including modify the referent, deleting it, or sharing it.
We can do this even in a pure function, because any effects cannot be
observed since we have been *promised* we’re the sole owner.
Passing a box to a function is NOT transparent: it changes the state
of the source variable. But the function itself remains pure.
Boxing a value is not automatically safe. A programmer boxing
a value is *asserting* sole ownership. In the function above,
the variable q is sharable. It is in fact shared (aliased) by passing
it to memset. However we know memset forgets it on returning.
So after memset is called, p is the sole alias again, and we are
justified in boxing its contents, because by the time the box
is possessed by another variable,m the function has returned
and the shared reference p is forgotten.
Note the proof here demands not only that the variable being
boxed is indeed solely owned, but it will not be used after
the boxing. A counter example:
var p = unbox q;
var r = box p; // OK
println$ p; // This is safe but ..
delete p;
return r; // WOOPS
It’s safe to do the println$ because it only BORROWS p.
So even though r is boxed, asserting its contents are unique
reference, the original unboxed sharable reference is indeed
shared, violating the contract. However the violation is benign
because println doesn’t reshare it.
Even the delete is perfectly fine, because we have not USED
the variable r yet, But when we return it, we have lost control
and it now happens to contain a dangling pointer..
THE POINT is that boxing is unsafe and the programmer
is responsible for using it correctly.
Note that using shared values is intrinsically unsafe!
We can delete an object even if we are not the sole owner.
This is why we might us refcounting or GC.
============
Now, a unique value can be safely passed to a function expecting a shared value:
the unbox operation is inserted implicitly. This is because uniq T is a subtype of T.
This also impacts nested uniques in tuples, for example.
Furthermore, a nested function can access a unique value and this is tracked
by the uniqueness checker if the nested function is explicitly called.
Note that a function can only make the parent uniq dead, since it cannot
assign to it, since functions aren’t allowed to have side effects.
Indeed this is one reason we need to think about borrowing!
But we’ll come to that in a minute. More of a problem is if a nested
function which accesses a parent unique is
(a) passed as an argument to another function
(b) returned
Note that if we did explicitly call a nested function which uses
the uniq, the uniq becomes dead, so we can only do it once
unless we relived the variable in between.
Felix creates closures by passing a pointer to the parent stack frame
to the child. So the issue is the same now as addressing a uniq.
The current system simply ignores all these cases. We don’t know
how many times a closure is called, and we don’t know what happens
to a pointer to a unique variable: is it used to read or write the variable and
if so how many times and in what order? We don’t know, without global
data flow analysis which tracks aliases.
============
Now the next issue: at present a List::map rerturns a uniq list.
So if we just say
var x = map f lst;
x is unique and cannot be shared. This breaks old code
written before we had unique. I have fixed the library to say
var x = unbox (map f lst);
to get back the old behaviour but only a few functions so far in the
library use uniq. There’s no problem if the variable is typed:
var x : list[int] = …
because the unboxing is implicit due to subtyping rules.
But its very inconvenient. At one time you could say
once x = …
and it meant the x was uniq. There was no uniq type constructor then.
We could allow:
shared x = …
to force the unboxing of uniqs, and
uniq x = …
to either require the RHS to be uniq OR to box the RHS if it is
not. in the latter case it is a contract assertion, rather than a type
constraint. So I don’t like that. I prefer it be an error to say
uniq x = 1; // ERROR
you have to say
uniq x = box 1; // OK
The idea is just that you’re expecting a function to return a uniq and should get
an error if it doesn’t.
===============================
Now FINALLY to borrowing. The idea of borrowing is simple enough:
it is a pain to move a uniq to a function, and force the function to move it
back in the return value AND the client to assign it:
var p = box ...
def var n, p = f p;
We have to use “def” here because it unpacks a tuple and then
allows the components to be assigned OR to initialise a new variable.
This is ugly. The idea of borrowing is that a box contents are copied
and put i a transparent wrapper which is passed to a function
WITHOUT making the variable dead. The function can then use
the referent provided it FORGETS it before returning.
in other words this is equivalent to making the variable dead,
then on return relivening it (without an assignment), Since
nothing in a single thread of control can happen to the variable,
unless the called function does it (since the caller is suspended),
this should be fine EXCEPT for pointer aliases (as always).
So lets think about
borrow T
type. If a parameter has this type a uniq or shared value
can be passed safely, if uniq it is NOT made dead.
Now the function has to unwrap the parameter to make
it accessible as usual but is then required to FORGET it
before returning. Mutation is still acceptable.
A borrowed value can be passed to another function.
It is accessible again after that function returns.
The function has to abide by the same rules.
So an unboxed value can NOT be safely passed:
the caller must be sure the function does the right thing.
Borrowed values can NOT be implicitly made shared!
Unlike uniq parameters we don’t own the referent.
On the other hand, a uniq value can always be loaned
safely, provided the borrower abides by the contract.
So uniq T is a subtype of borrowed T.
Again, a shared value can be passed safely where a borrowed
value is expected. So T is subtype of borrowed T.
SUBTYPE RULES:
uniq T -> T -> borrowed T
Note that by transitivity
uniq T -> borrowed T
Now we need some value operators. To access a borrowed
value, we can try “unwrap”. unlike unbox, this operation
is UNSAFE. This means it imposes a responsibility on the
unwrapper to FORGET the unwrapped value.
unwrap: borrowed T -> T
Wrapping is safe:
wrap: T -> borrowed T
lend: uniq T -> borrowed T
These operations are safe and can be implicit.
==============================
SUMMARY.
==========
The system I propose is based on the contract programming method.
Unlike systems like Rust, UNSAFE operations are intrinsic to the design.
An unsafe operation imposes a responsibility to obey the contract on
the user of the unsafe operation.
The safe operations, on the other hand, assume the provider did
the right thing.
The type system has two roles. First, it provides the safe implicit coercions
based on subtyping rules. Secondly, auxilliary algorithms must track
liveness states.
Borrowed values must be forgotten. This means they cannot be put on the heap.
As usual, pointers to variables, by either addressing or closure formation
are not tracked.
A borrowed parameter expresses a contract assertion that the borrowing
function will forget the parameter and it’s contents. For a local
variable, borrowed type implies that the variable will not be used
after the variable it borrowed from (if uniq) is killed.
There are thus two kinds of contracts: those that are enforced
by the compiler, including those that are involve implicit safe
coercions, and those that impose constraints on the programmer
to ensure the resulting system is sound.
UNLIKE other systems, the type system does NOT ensure soundness.
Rather, it partitions the responsibility, so that it is responsible for
enforcement of parts of a contract, and the programmer the other parts.
The resulting system is SIMPLE and the programmer responsibilities
MANIFEST. The programmer should be able to prove their part
of the contract is adhered to, so that the compiler type checks,
which prove the other parts are adhered to, result in a sound system.
All this with a caveat that the compiler type system cannot track
aliases. Explicit addressing is MANIFEST so use of the addresses
is again the programmer responsibility.
Unfortunately, CLOSURE FORMATION is NOT manifest (explicit)
and is therefore difficult for both the compiler and programmer
to account for.
Still .. something is better than nothing. We do not support
throwing out expressivity just because the resulting system
allows coding which is hard to prove correct. Restrictions just lead
to work around which are even harder to prove correct.
—
John Skaller
ska...@internode.on.net