RFC: Generalized Immutablity

106 views
Skip to first unread message

Adam Armstrong

unread,
Apr 13, 2014, 10:59:02 AM4/13/14
to kl...@googlegroups.com
This is a draft for a way to add primitives for implementing immutability for custom encapsulation types and built in types.

The proposed design is modeled after const in C and C++, and the applicatives' names are from Ruby.

-----

Rational:

Controlling the ability to mutate an object helps write robust and understandable code; therefore, it should be encouraged by providing tools to limit mutation efficiently and easily.

The gains in robustness and ability to reason about the code are even more pronounced in the presence of multiple threads, which is becoming more common in programs.

Furthermore, the main problem in optimizing Kernel code is the lack of a way to trivially prove the value of variables.
Having a standard way to control mutation should help analyze code more quickly and increase the number of optimizations that can be determined for code that makes good use of the features being proposed here.
While not a free lunch, reducing the hoops a programmer needs to leap through to write code that disallows mutation should lead to fewer for any compilers used of the code.

-----

Summary of additions:

(freeze . objects)

Applicative that returns references to the objects passed to it that have been marked as frozen.

(frozen? . objects)

Applicative that returns true iff all references in objects have been marked as frozen.

-----

Details:

Having an object marked as frozen does not by itself enforce immutability, but if all mutators for a type raise an error when provided a reference that is frozen it reaches the same result.

Having the immutability be optional allows for "logical immutability".
e.g. Promises can have their internal data mutated when forced, but it would not be a violation of immutability because it is invisible to the code using the promise.

Furthermore the mark is applied to the reference rather than the object, so it can still be mutated by code that has access to though an unmarked reference.
In the case where it is desirable to know an object cannot be mutated elsewhere you still need to create a copy and ensure no unfrozen references to the copy escape.
To avoid unnecessary copies if falls to the implementation of the type to track whether a particular object was created in a way that ensures it cannot be mutated.
(e.g. for copy-es-immutable! to avoid unnecessary copying pairs created by it would need additional information beyond the frozen mark)
Also, as freeze does not modify an object the passing frozen and unfrozen references to equ? and equal? should result in #t as they are still the same object.

Furthermore no "unfreeze" applicative that unmarks references is included in this proposal as it would severely limit the usefulness "freeze".
Also having an "unfreeze" would make it more difficult for optimizing compilers to use freeze and frozen to aid in proving that an object is constant in a reliable manner.
Finally, it is possible to mutate  the object if you can get an unfrozen reference to it, thus it is possible but more difficult to do potentially dangerous mutation.

-----

Example Code:

(write (force (freeze ($lazy (+ n 42))))) ; no error, while force technically mutates the promise it is hidden from the program and not considered a mutation.

($let ((x (get-current-environment)) (y (freeze (get-current-environment))))
 ($if ($and (equ? x y) ($not (frozen? x))) ; evaluates true*, they refer to the same environment even though y is frozen.
  ($set! x n 42) ; no error, x is not frozen* even if y is
  ($set! y n 21) ; error, y is marked as frozen
 )
)

* Unless (get-current-environment) returned a frozen reference, which is possible. (e.g. calling $remote-eval and passing a frozen reference to the environment to evaluate in).

-----

Comparison to current means:

Currently for encapsulation types, this can mostly be modeled by having two types.

Having separate types has an advantage over this proposal in that having to use a different accessor for mutable and immutable encapsulations could help avoid mistakes.

This proposal has the advantage of efficiency as using the same accessor for both means procedures that do not mutate the object only need one code path.
It also requires writing less code to achieve the desired result.

While it does not currently apply, if an optimizing compiler for Kernel is created having a standard way of limiting mutation would be helpful for determining how an object could be mutated at run time.

I believe the changes here are the difference between people using proper controls on mutation and not doing so (whether to reduce work, avoid precieved ineffciency, or reduce compile times),
then the increased chance of mistakes being made are acceptable.

One thing that can be done with the proposed method that might not always be possible with just encapsulations is when an accessor returns internal data from the object.
With the proposed method it is possible to provide access to internal data without giving the ability to mutate it or having to make a copy.
Encapsulations can do this, but it requires the infrastructure needed was provided for the type in question.
That may not be the case for types from code written by others and is not currently the case for environments.

An additional note on the difference between the two approaches is when using two encapsulation types the mutable and immutable versions of an object would not be equ? while they would under the proposed method.

-----

Implementation:

If the proposal seems like a reasonable addition to klisp, I will work on implementing the needed changes as life permits once pull request for some MinGW specific fixes I submitted has been accepted or declined.
I am waiting on that since I plan to delete my fork and refork before doing more work because being unfamiliar with Mercurial I made a bit of a mess in the repository (the branch for the pull request is not affected).

Adam Armstrong

unread,
Apr 13, 2014, 3:01:23 PM4/13/14
to kl...@googlegroups.com
Seems I forgot to cover a few points in the original post.

First is that in the Kernel Report it mentions that a means of creating an individual pair was not included in part because it would allow creating evaluation structures with complex mutability schemes.
To me, the main problem with an (immutable-cons) applicative would be that there is either no way to mark the car or cdr as constant unless it is made possible for any code to do so, which would not always be desirable.
Using the proposed primitive, the problem (as I see it) is resolved as you can keep a mutable reference if you might want to modify such properties later.

Second is that with environments it is currently possible to allow read only access by creating and passing an empty child environment.
The proposed method avoids an extra memory allocation when the ability to mutate the environment is not needed.

Third, the references returned by freeze should always be equ? to the corresponding parameter except when it can be proven to not matter.
Specifically, it is technically possible in the case of an empty environment with one parent and without a static keyed variable to return a frozen reference to the parent.
The requirement of proving that the address of the returned object does not matter is needed to prevent this being used to determine the ancestry of an environment.
For uniformity the restriction applies to similar cases for non-environments.

Oto Havle

unread,
Apr 14, 2014, 5:47:15 PM4/14/14
to kl...@googlegroups.com
Hi Adam,

it is an interesting proposal. I think it could be implemented
in Klisp, to see how it turns out. See my comments and questions below.

On 04/13/2014 04:59 PM, Adam Armstrong wrote:
> This is a draft for a way to add primitives for implementing
> immutability for custom encapsulation types and built in types.
>
> The proposed design is modeled after const in C and C++, and the

int const *** const ** const * x; /* legal C :) */

I expect some unintuitive behavior, if the frozen references
are first class and can be stored in another objects and mixed
with mutable references. For example, what can you tell about
these forms:

($let ((x (freeze y))) (append! x z))

($when (not? (frozen? x)) (append! x y))

Do they signal an error? When?

> applicatives' names are from Ruby.
[...]
> *Summary of additions:*
>
> (freeze . objects)
>
> Applicative that returns references to the objects passed to it that
> have been marked as frozen.
>
Nitpicking: Should freeze really allow more than one argument?
Kernel Language does not have multiple value return facility
like CL or Scheme. Multiple results are usually (AFAIK) returned
in lists, e.g. get-list-metrics. This convention does not seem
fit freeze well. If (freeze x y) returns a two element list

(freeze x y) == (list (freeze x) (freeze y))

then (freeze x) should also return a list. But it does not,
so better restrict freeze to one argument (like e.g. symbol->string).

> (frozen? . objects)
>
> Applicative that returns true iff all references in objects have been
> marked as frozen.
>
Does it mean that (frozen? x) scans recursively all references contained
in x? If so, how it does handle encapsulated types?

($define! x (list (list 1 2) 3))
($define! y (freeze x))
(frozen? y) ; #t or #f ?
(frozen? (car y)) ; #t or #f ?

[...]
> Also, as freeze does not modify an object the passing frozen and
> unfrozen references to equ? and equal? should result in #t as they are
> still the same object.
>
[...]
>
> ($let ((x (get-current-environment)) (y (freeze (get-current-environment))))
> ($if ($and (equ? x y) ($not (frozen? x))) ; evaluates true*, they
> refer to the same environment even though y is frozen.
> ($set! x n 42) ; no error, x is not frozen* even if y is
> ($set! y n 21) ; error, y is marked as frozen
> )
> )
>

If equ? means eq?, I tend to disagree. It is a matter of opinion,
but I would like (eq? x y) imply that x and y are indistinguishable,
even though the Kernel Report section 3.7 suggests that it may
not be the case. Your example shows that the program can tell
the frozen reference from the original one.

[...]

Best wishes,

Oto Havle.

Adam Armstrong

unread,
Apr 15, 2014, 11:11:19 AM4/15/14
to kl...@googlegroups.com




   int const *** const ** const * x;  /* legal C :) */

I expect some unintuitive behavior, if the frozen references
are first class and can be stored in another objects and mixed
with mutable references. For example, what can you tell about
these forms:

   ($let ((x (freeze y))) (append! x z))

   ($when (not? (frozen? x)) (append! x y))

Do they signal an error? When?

Indeed, it is problematic when used on lists since they are composed of multiple objects instead of a single container object.
Therefore for append! there would be an error only if the last pair in the list is frozen.

The results could be made a bit more obvious if (car pair) and (cdr pair) returned a frozen reference if pair is frozen.
However, that would raise the question of if the returned reference should be frozen for just references to pairs or for references to all types.
 
My current opinion is that lists of pairs are not a good replacement for proper container types in code that wants to work with a specific data structure, with or without freeze being present.
An example is code that wants a singly linked list. Because pairs can for cyclic lists, it would need to check for cycles. With freeze it would need to check for it when preforming mutations.
For robust code an encapsulation of a linked list would be better as it could (and should) maintain an invariant that there are no cycles and you would only need to see if the container is frozen to see if mutation is safe.

In summary, I think using pairs to represent lists should be done only when their extent is local and they can be determined to not have cycles and to not be frozen by examination of the surrounding code.
When passing lists around the program and ensuring invariants becomes less trivial, use a real data structure that can ensure invariants you need.


> applicatives' names are from Ruby.
[...]
> *Summary of additions:*
>
> (freeze . objects)
>
> Applicative that returns references to the objects passed to it that
> have been marked as frozen.
>
Nitpicking: Should freeze really allow more than one argument?
Kernel Language does not have multiple value return facility
like CL or Scheme. Multiple results are usually (AFAIK) returned
in lists, e.g. get-list-metrics. This convention does not seem
fit freeze well. If (freeze x y) returns a two element list

   (freeze x y) == (list (freeze x) (freeze y))

then (freeze x) should also return a list. But it does not,
so better restrict freeze to one argument (like e.g. symbol->string).


Yes, it should operate on only one object to be more consistent with the rest of the language.
(freeze x) instead of (freeze . objects)

I was thinking of how $define! and $set! work at the time I was writing it up.
($define (x y z) (freeze a b c))
 
> (frozen? . objects)
>
> Applicative that returns true iff all references in objects have been
> marked as frozen.
>
Does it mean that (frozen? x) scans recursively all references contained
in x? If so, how it does handle encapsulated types?

   ($define! x (list (list 1 2) 3))
   ($define! y (freeze x))
   (frozen? y)                      ; #t or #f ?
   (frozen? (car y))                ; #t or #f ?


I do not think it should. If you are passing a list around enough that you need freeze, you probably should be using an encapsulation representing a list instead.
 
[...]
> Also, as freeze does not modify an object the passing frozen and
> unfrozen references to equ? and equal? should result in #t as they are
> still the same object.
>
[...]
>
> ($let ((x (get-current-environment)) (y (freeze (get-current-environment))))
>   ($if ($and (equ? x y) ($not (frozen? x))) ; evaluates true*, they
> refer to the same environment even though y is frozen.
>    ($set! x n 42) ; no error, x is not frozen* even if y is
>    ($set! y n 21) ; error, y is marked as frozen
>   )
> )
>

If equ? means eq?, I tend to disagree. It is a matter of opinion,
but I would like (eq? x y) imply that x and y are indistinguishable,
even though the Kernel Report section 3.7 suggests that it may
not be the case. Your example shows that the program can tell
the frozen reference from the original one.

[...]

Best wishes,

   Oto Havle.


To me the obvious implementation of eq? is to check if the references point to the same memory.

However, I suppose that would not technically be true in all cases if the implementation simply manipulates a type tag to swap between some of the built in types.
(e.g. I can see applicatives and operatives using the same representation, and just having eval treat them differently based on the type tag)

I will be replying later with more on this last subject.


Best wishes,
Adam Armstrong

Andres Navarro

unread,
Apr 15, 2014, 1:18:05 PM4/15/14
to kl...@googlegroups.com
On Sun, Apr 13, 2014 at 11:59 AM, Adam Armstrong <urulo...@gmail.com> wrote:

If the proposal seems like a reasonable addition to klisp, I will work on implementing the needed changes as life permits once pull request for some MinGW specific fixes I submitted has been accepted or declined.
I am waiting on that since I plan to delete my fork and refork before doing more work because being unfamiliar with Mercurial I made a bit of a mess in the repository (the branch for the pull request is not affected).

Sorry about that.  I thought I had accepted it but apparently not...
It's done now.  Thanks!

Regards,
Andres Navarro

Andres Navarro

unread,
Apr 15, 2014, 1:52:41 PM4/15/14
to kl...@googlegroups.com
Regarding the proposal:

While trying the control mutablity is a worthy goal, I'm not quite sure this is the way to go.
Please, don't read any of the following as a discouragement.  Just a little food for thought, based
on my personal opinion.


I favor the use of new encapsulated types with optional immutablity specified in the constructor because
they are easier to reason about and don't require modifcation to the underlying system.

For example see boxes in the racket docs: http://docs.racket-lang.org/reference/boxes.html).
In this context, a box is the minimal element mutable storage.  They can be easily implemented in klisp
with either a one element array or using a cons cell and encapsulation.  You can allow both mutable and
immutable boxes, with the accessor & type-predicate common to both kinds.

Now a way to easily create this type of encapsulations would be nice (say a constructor that returns five
procedures: a constructor for immutables, a constructor for mutables, a type-predicate, a mutable-predicate
and an accessor.  This is more or less trivial to write in terms of make-encapsulation and enables easy
construction of types like the "box" described above (An alternative to this would be a general
record-definition for new types)

Of course there are some drawbacks to this approach, some of which you indentified in your proposal.
and some of them have workarounds.

For a proposed method of making box use more transparent see the srfi 111:
http://srfi.schemers.org/srfi-111/srfi-111.html
Disclaimer: I'm not particularly fond of the autobox/autounbox feature personally, but some people are.

As for the use case of giving access to internal data, I'm not sure it justifies the introduction of references
as first class objects (and remember every entity in Kernel should be first class).  I feel there's always a
cleaner way of giving access to internal data: via procedures, boxes, copies, iterators, cursors, etc.

Once you add (possibly mutable) references as first class object you have to allow to distinguish between
them and the object they refer to (otherwise they wouldn't be first class).  And once you do that: what's the
difference with boxes?  Furthermore, for some special use cases you may need to modify the basic workings
of the interpreter (for auto-boxing/auto-unboxing) and after all that trouble, what have you gained really?

I think references work well in environments where they are second class & the compiler
understands all uses of them and can reason about them statically. C++ is a good example of that.

Regards,
Andrés Navarro



On Sun, Apr 13, 2014 at 11:59 AM, Adam Armstrong <urulo...@gmail.com> wrote:

--
You received this message because you are subscribed to the Google Groups "klisp" group.
To unsubscribe from this group and stop receiving emails from it, send an email to klisp+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Andres Navarro

unread,
Apr 15, 2014, 2:00:39 PM4/15/14
to kl...@googlegroups.com
On Tue, Apr 15, 2014 at 12:11 PM, Adam Armstrong <urulo...@gmail.com> wrote:

> *Summary of additions:*
>
> (freeze . objects)
>
> Applicative that returns references to the objects passed to it that
> have been marked as frozen.
>
Nitpicking: Should freeze really allow more than one argument?
Kernel Language does not have multiple value return facility
like CL or Scheme. Multiple results are usually (AFAIK) returned
in lists, e.g. get-list-metrics. This convention does not seem
fit freeze well. If (freeze x y) returns a two element list

   (freeze x y) == (list (freeze x) (freeze y))

then (freeze x) should also return a list. But it does not,
so better restrict freeze to one argument (like e.g. symbol->string).


Yes, it should operate on only one object to be more consistent with the rest of the language.
(freeze x) instead of (freeze . objects)

I was thinking of how $define! and $set! work at the time I was writing it up.
($define (x y z) (freeze a b c))


$define! and $set! work like that because they take "operand trees" like $lambda & $vau forms.
The correct way to define freeze is indeed as taking a single operand and users can them call
map if they need to freeze more than one object.
 
 
> (frozen? . objects)
>
> Applicative that returns true iff all references in objects have been
> marked as frozen.
>
Does it mean that (frozen? x) scans recursively all references contained
in x? If so, how it does handle encapsulated types?

   ($define! x (list (list 1 2) 3))
   ($define! y (freeze x))
   (frozen? y)                      ; #t or #f ?
   (frozen? (car y))                ; #t or #f ?


I do not think it should. If you are passing a list around enough that you need freeze, you probably should be using an encapsulation representing a list instead.
 

But then they aren't that useful for any type of container in my opinion.  What about hashtables, or vectors.  What use is immutability
of a container if you can still mutate the objects contained in them (possibly breaking invariants in the process)?.
 
[...]
> Also, as freeze does not modify an object the passing frozen and
> unfrozen references to equ? and equal? should result in #t as they are
> still the same object.
>
[...]
>
> ($let ((x (get-current-environment)) (y (freeze (get-current-environment))))
>   ($if ($and (equ? x y) ($not (frozen? x))) ; evaluates true*, they
> refer to the same environment even though y is frozen.
>    ($set! x n 42) ; no error, x is not frozen* even if y is
>    ($set! y n 21) ; error, y is marked as frozen
>   )
> )
>

If equ? means eq?, I tend to disagree. It is a matter of opinion,
but I would like (eq? x y) imply that x and y are indistinguishable,
even though the Kernel Report section 3.7 suggests that it may
not be the case. Your example shows that the program can tell
the frozen reference from the original one.

[...]

Best wishes,

   Oto Havle.


To me the obvious implementation of eq? is to check if the references point to the same memory.


eq? behaviour is tricky.  The report was clearly no written with taggable references in mind obviously.
On one hand, the object they point to may be the same.  On the other hand you can do things with
one of the references you can't do with the other, so they may be distinguished.  One more reason
I don't feel comfortable with this proposal...

Regards,
Andres Navarro

Adam Armstrong

unread,
Apr 15, 2014, 2:38:33 PM4/15/14
to kl...@googlegroups.com
I had wondered about it.

Thank you.

Adam Armstrong

unread,
Apr 15, 2014, 4:00:43 PM4/15/14
to kl...@googlegroups.com


On Tuesday, April 15, 2014 1:52:41 PM UTC-4, Andres Navarro wrote:
Regarding the proposal:

While trying the control mutablity is a worthy goal, I'm not quite sure this is the way to go.
Please, don't read any of the following as a discouragement.  Just a little food for thought, based
on my personal opinion.


I favor the use of new encapsulated types with optional immutablity specified in the constructor because
they are easier to reason about and don't require modifcation to the underlying system.
 

For example see boxes in the racket docs: http://docs.racket-lang.org/reference/boxes.html).
In this context, a box is the minimal element mutable storage.  They can be easily implemented in klisp
with either a one element array or using a cons cell and encapsulation.  You can allow both mutable and
immutable boxes, with the accessor & type-predicate common to both kinds.

Now a way to easily create this type of encapsulations would be nice (say a constructor that returns five
procedures: a constructor for immutables, a constructor for mutables, a type-predicate, a mutable-predicate
and an accessor.  This is more or less trivial to write in terms of make-encapsulation and enables easy
construction of types like the "box" described above (An alternative to this would be a general
record-definition for new types)

Of course there are some drawbacks to this approach, some of which you indentified in your proposal.
and some of them have workarounds.

For a proposed method of making box use more transparent see the srfi 111:
http://srfi.schemers.org/srfi-111/srfi-111.html
Disclaimer: I'm not particularly fond of the autobox/autounbox feature personally, but some people are.

As for the use case of giving access to internal data, I'm not sure it justifies the introduction of references
as first class objects (and remember every entity in Kernel should be first class).  I feel there's always a
cleaner way of giving access to internal data: via procedures, boxes, copies, iterators, cursors, etc.


The point of freeze and frozen? in that use case is to identify if write access should be given.
The means of giving or restricting access are left to the procedure and type.
 
Once you add (possibly mutable) references as first class object you have to allow to distinguish between
them and the object they refer to (otherwise they wouldn't be first class).  And once you do that: what's the
difference with boxes?  Furthermore, for some special use cases you may need to modify the basic workings
of the interpreter (for auto-boxing/auto-unboxing) and after all that trouble, what have you gained really?

I think references work well in environments where they are second class & the compiler
understands all uses of them and can reason about them statically. C++ is a good example of that.

Regards,
Andrés Navarro


Indeed it is possible to implement it for types that are aware of the system.
I will probably include a limited implementation that works for types made through an alternate type constructor when I post a revision to how freeze interacts with equ?.

Adam Armstrong

unread,
Apr 17, 2014, 11:05:20 AM4/17/14
to kl...@googlegroups.com
Attached are a demo implementation of freeze and frozen? and some tests that illustrate how they function.
The demo implementation also creates an applicative make-freezable-type that is like make-encapsulation-type except the types made with it work with freeze and frozen?.

The demo conforms to a different set of constraints on equality that originally proposed.
The new constraints are summarized below:

If some object x is frozen?, then the result of passing it to freeze must be eq? to x:
if (frozen? x) then (eq? x (freeze x))

The result of freeze is only eq? to the parameter if it is frozen?:
(eq? x (freeze x)) iff (frozen? x)

If the results to two calls to freeze must be eq? if the parameters are eq?:
if (eq? x y) then (eq? (freeze x) (freeze y))

When the above rules are combine, it leads to the restriction on when the results of freeze may be equ?:
(eq? (freeze x) (freeze y)) iff (or? (eq? x y) (and? (frozen? x) (eq? x (freeze y))) (and? (frozen? y) (eq? (freeze x) y)))
freezing.k
freezing-test.k
Reply all
Reply to author
Forward
0 new messages