Safe concurrency WAS: Interesting ideas about concurrency and mutable state from Pony lang

167 views
Skip to first unread message

Tom Kaitchuck

unread,
May 24, 2015, 11:52:13 PM5/24/15
to ceylon...@googlegroups.com

I am forking this discussion into a new thread, as it does not relate very much to Pony.


I have been thinking for a while about how to achieve safe concurrency. Reading up on Pony, it is similar to Rust's approach. While they are totally different in implementation, what they are doing is looking at what programmers would do in their language by convention to maintain thread safety and codifying those practices into the language itself with some automation.


Rust is modeled very similarly to C++ and it's concurrency is based on similar primitives. In order to be thread safe in C++ the developer needs to be very rigorous about ownership. If ownership is always clearly established, the thread safety is easy. They have simply baked in rules into the compiler such that you MUST establish clear ownership. Pony is similar but using an actor model.


Neither of these implementations appropriate to Ceylon, but the idea is. At some point, perhaps in the 2.0 time frame Ceylon will develop its own Concurrency model. It would be good if it guaranteed safety and was modeled on the conventions developers follow today in its absence.


The following is one way thread safety could be guaranteed. (Thread safety consists of; no data races and no deadlocks or livelocks. This does not imply that the code is correct.)


I am going to distinguish between deep and shallow immutability. IE: A class's members don’t change vs if nothing it references members can change.


Deeply Immutable objects are implicitly thread safe.

Shallowly Mutable objects may or may not be. (We would like to guarantee that they are.)

Shallowly immutable objects can be considered thread safe if all of the objects they reference are thread safe.


In the absence of automated means, a good convention to follow would be to ensure that any shallowly immutable object is thread safe via synchronization. If they can be shown to be correct one does not need to worry about any other classes. In Java this is where all synchronization end up going.

A shallowly mutable object is guaranteed to be thread safe if all of the following are true:

  • It provides locking or equivalent over all of each of its individual methods. (synchronized block, read/write locks, and software transactional memory all work fine.)

  • It does make any blocking calls. (To guard against livelock)

  • If there is no ulterior way to modify its members or the contents of its members without calling one of its methods. (IE: No other class has a reference to any mutable object it holds.)

  • Calling its methods will not cause a change that would invalidate any of the above. (IE: It does not have a method that returns a mutable object it continues to hold, nor does it have a method that accepts and stores a mutable object that the caller still holds a reference to.)


For the most part these are things you would do naturally if you were writing thread safe code. However to enforcing or automating them creates added complexity.


One way to do this would be to add additional information into the type system. The minimal information that would need to be added would be some sort of indicator if an Object can be guaranteed to be deeply immutable. (Unknown can be treated as mutable)


I am going to invent some syntax as follows:

  • The type of a: Mutable object, a Shallowly Immutable but deeply mutable object, and an object of unknown mutable remains the same as it is today.

  • For objects the user creates (not builtins) that they wish to guarantee to always be deeply immutable, a ! is placed at the end of the class name. (This is validated by checking that all of its members are either immutable builtins or also types ending in !)


A variable whose type is an interface is assumed to be potentially mutable. This allows the implementer to change the implementation without affecting others.


Tracking blocking functions requires adding information to their type. This could be accomplished by making the type: ‘blocking String()’ instead of simply ‘String()’ and enforced by adding a ‘blocking’ annotation on blocking functions and requiring methods calling them to be similarly annotated.


To avoid restrictions on all the methods of mutable objects, there should be an annotation for methods that mutate their member variables. This way the read-only methods are not affected.


Given all of this one could write the following:


class Example1(variable String a) {

void printA() {

print(a);

}

mutates void append(String newString) {

a = a + newString;

}

}


And:


class Example2() {

value list = ArrayList<String>();

void printList() {

print(list);

}

void add(String newString) {

list.add(newString);

}

void addAll(List<String> newStrings) {

for (String s in newStrings) {

list.add(s);

}

}

}


All of which could be made thread safe by the compiler if needed with no extra work for the developer.


It would however restrict what one can do:

class IllegalExample() {

List<String> value list = ArrayList<String>();

void illegalMethod(ArrayList<String> newStrings) {

list = newStrings; //BAD: Assigns without mutates annotation

}

mutates void illegalMethod2(List<String> newStrings) {

//BAD: function annotated with ‘mutates’ receiving a potentially mutable object

list = newStrings;

}

void validMethod() { //OK: No local mutation

String s = getString();

list.clear();

list.add(s);

}


/** Can be invoked like: foo.alsoValid(ArrayList<String>) */

mutates void alsoValid(List<String>()! newStrings) {

list = newStrings();

}

}


The same applies to blocking:


class BlockingExample(blocking String() getString) {

variable value list = ArrayList<String>();

mutates void foo() { //BAD: mutates function making blocking call.

String s = getString();

list = ArrayList<String>();

list.add(s);

}

void foo2() { //OK

String s = getString();

updateList(s); // Indirection of modification to another function.

}

mutates void updateList(String s) {

list = ArrayList<String>();

list.add(s);

}

}


This would have very little impact on day to day coding as modifications can be moved down into a private method or into a data structure and blocking calls can be moved up into a wrapping method.


That would all work great until someone tries to implement a map. A map implementation shouldn’t be limited to immutable items nor should the caller be burdened to pass in constructors/functions. A map *should* be able to take anything for the item, even if it is mutable. It is safe because it is not going to call methods on it.


To facilitate this we need one more feature; the ability to indicate a generic type should be treated as immutable even if the class provided at runtime is not. This could be done by putting an ! at the end of the generic type name just as one would with a real type. Then HashMap’s signature would look like:  


class HashMap<Key!, Item!> extends Basic satisfies MutableMap<Key!,Item!> given Key! satisfies Object!


Notice here that we require the implementation of Key! to actually be immutable by having it extend Object! while Item! is not restricted as it extends Anything. So it is perfectly possible to provide List<String> as the value for Item! even though it is mutable. This is safe because the map cannot going to call methods on Item!.


So why is all of this safe?

It can easily meet all of the criteria that are sufficient for safety layed out above:

  • Any given assignment can have synchronization or STM code injected to protect it.

  • A mutable object making a blocking call can easily be detected.

  • A nested mutable object can not receive the state it needs to guard from the outside, nor will it ever vend it externally.

  • Finally lock ordering can be guaranteed because there are no call stack cycles among mutable objects.


It of course is still possible to write wrong code. IE: The following is invalid whether the map was made thread safe automatically or manually:

value list = ThreadSafeMap<String,String>();

void foo() {

if (list.contains(“foo”) {

String? foosValue = list.get(“foo”);

if (!exists foosValue) {

throw ImpossibleException(); //This can actually occur...

}

}

}

The above and similar code will always require some form of annotating to be correct under any circumstances. In this case adding the ‘mutates’ annotation even though the compiler would not error without it, would be one way to fix the issue.



Renato Athaydes

unread,
Jun 6, 2015, 5:10:57 PM6/6/15
to ceylon...@googlegroups.com
I think the important thing for Ceylon is to have a concurrency model which does not directly expose Threads and locks (and hopefully does not use locks at all, as in Pony) as, obviously, that wouldn't work in JavaScript... but also because it is possible to do like Pony, and even Clojure's core.async library, and abstract concurrency in terms of channels and "go" procedures, for example.

The Ceylon type system can be enhanced to prove safety when sharing memory (mutable data) between procedures (which might be Threads), especially if it is able to guarantee which references are visible and where they can be read and written, which is what the Pony papers proves to be possible.

In the post above, the assumption seems to be that the use of locks and Threads is essential to achieving concurrency in Ceylon, when in fact, it is not (to achieve true parallelism, yes, we will have to recur to the platforms' primitives, but that should not be essential to the model, and is not even possible in JS).

It is essential for Ceylon to be able to "mark" which types are immutable (not only for concurrency, in my opinion...), which obviously makes concurrency much easier, but it is not only about mutability.... a reference which is mutated only by a single Thread, atomically, can be safely read by multiple Threads without locks. Achieving atomicity may actually require locks in the JVM, but that's not exposed to the programmer and has no risk of causing deadlocks as there's no application code-level locks.

I dislike having to annotate mutator methods because the compiler should be able to analyse "variable" references being accessed and deduce which methods are mutating much better (and always correctly) than the programmer. Every time a method mutates a "variable" the compiler could "mark" the method as unsafe to use in more than one Thread... Accessing the reference could still be safe in multiple Threads...

So, the secret is that concurrency is possible not only with immutability, but with isolated mutable data structures, from which only immutable, or also isolated mutable data can be seen.

From the Pony docs:
"By sharing only immutable data and exchanging only isolated data we can have safe concurrent programs without locks. The problem is that it's very difficult to do that correctly. If you accidentally hang on to a reference to some isolated data you've handed over or change something you've shared as immutable then everything goes wrong. What you need is for the compiler to force you to live up to your promises. Pony capabilities allow the compiler to do just that."

I do not see how this is not achievable in Ceylon.

Tom Kaitchuck

unread,
Jun 6, 2015, 8:33:55 PM6/6/15
to ceylon...@googlegroups.com

On Saturday, June 6, 2015 at 2:10:57 PM UTC-7, Renato Athaydes wrote:
I think the important thing for Ceylon is to have a concurrency model which does not directly expose Threads and locks (and hopefully does not use locks at all, as in Pony) as, obviously, that wouldn't work in JavaScript... but also because it is possible to do like Pony, and even Clojure's core.async library, and abstract concurrency in terms of channels and "go" procedures, for example.

Yes, absolutely. If your code compiles it should be thread safe without you having to think about it.
 
The Ceylon type system can be enhanced to prove safety when sharing memory (mutable data) between procedures (which might be Threads), especially if it is able to guarantee which references are visible and where they can be read and written, which is what the Pony papers proves to be possible.

Yes.
 
In the post above, the assumption seems to be that the use of locks and Threads is essential to achieving concurrency in Ceylon, when in fact, it is not (to achieve true parallelism, yes, we will have to recur to the platforms' primitives, but that should not be essential to the model, and is not even possible in JS).

I don't make this assumption. Rather I am saying that one implementation could use locks. There are others. (Making a STM like closure is the most obvious). But the user need not be exposed to what the implementation is. 
 
It is essential for Ceylon to be able to "mark" which types are immutable (not only for concurrency, in my opinion...), which obviously makes concurrency much easier, but it is not only about mutability.... a reference which is mutated only by a single Thread, atomically, can be safely read by multiple Threads without locks. Achieving atomicity may actually require locks in the JVM, but that's not exposed to the programmer and has no risk of causing deadlocks as there's no application code-level locks.

I dislike having to annotate mutator methods because the compiler should be able to analyse "variable" references being accessed and deduce which methods are mutating much better (and always correctly) than the programmer. Every time a method mutates a "variable" the compiler could "mark" the method as unsafe to use in more than one Thread... Accessing the reference could still be safe in multiple Threads...

Detecting blocking and mutable methods could be automated by the compiler. After all the compiler need to be able to check if the constraints are violated, it could infer if it instead. However I am not sure I would this to be so transparent as a developer. As the caller of a method I usually want to know if it is going to block or change the object I am calling it on. 
 
So, the secret is that concurrency is possible not only with immutability, but with isolated mutable data structures, from which only immutable, or also isolated mutable data can be seen

Yes, exactly.
 
From the Pony docs:
"By sharing only immutable data and exchanging only isolated data we can have safe concurrent programs without locks. The problem is that it's very difficult to do that correctly. If you accidentally hang on to a reference to some isolated data you've handed over or change something you've shared as immutable then everything goes wrong. What you need is for the compiler to force you to live up to your promises. Pony capabilities allow the compiler to do just that."

I do not see how this is not achievable in Ceylon.

I believe the above is one way to do so.

Gavin King

unread,
Jun 7, 2015, 5:57:18 AM6/7/15
to ceylon...@googlegroups.com
I would like to point out that the kind of analysis you're talking
about here is something that you guys could implement as:

- a couple of annotations written in Ceylon, and
- a Visitor for the typechecker's AST.

AFAICT, this kind of stuff doesn't actually require very much in the
way of impact on the existing codebase, so if it's something you care
about, it's something *you* could contribute.
> --
> You received this message because you are subscribed to the Google Groups
> "ceylon-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to ceylon-users...@googlegroups.com.
> To post to this group, send email to ceylon...@googlegroups.com.
> Visit this group at http://groups.google.com/group/ceylon-users.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/ceylon-users/939b4b6e-e9ce-483c-9466-23e178509a74%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.



--
Gavin King
ga...@ceylon-lang.org
http://profiles.google.com/gavin.king
http://ceylon-lang.org
http://hibernate.org
http://seamframework.org

Dirk Lattermann

unread,
Jun 7, 2015, 1:10:12 PM6/7/15
to ceylon...@googlegroups.com
Am Sat, 6 Jun 2015 14:10:57 -0700 (PDT)
schrieb Renato Athaydes <ren...@athaydes.com>:

> [...]

> Every time a method mutates a "variable" the
> compiler could "mark" the method as unsafe to use in more than one
> Thread... Accessing the reference could still be safe in multiple
> Threads...
>
> So, the secret is that concurrency is possible not only with
> immutability, but with isolated mutable data structures, from which
> only immutable, or also isolated mutable data can be seen.

There still need to be memory barriers so data structures modified by
one thread become visible eventually in other threads. I don't think it
would be desirable to only use JVM volatile variables.

Dirk

Renato Athaydes

unread,
Jun 7, 2015, 3:33:03 PM6/7/15
to ceylon...@googlegroups.com
AFAICT, this kind of stuff doesn't actually require very much in the
way of impact on the existing codebase, so if it's something you care
about, it's something *you* could contribute.


 I would love to play with the type checker, annotations processing and compiler. I've asked about this before.... do you have a compiler plugin guide or something to help with that?

Renato

Gavin King

unread,
Jun 7, 2015, 3:44:20 PM6/7/15
to ceylon...@googlegroups.com
Well, no, because we don't have any welldefined API for it (yet), but
honestly to hack a new Visitor into the lifecycle of PhasedUnit is
trivial. Just look at how the existing typechecker Visitors work.
> --
> You received this message because you are subscribed to the Google Groups
> "ceylon-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to ceylon-users...@googlegroups.com.
> To post to this group, send email to ceylon...@googlegroups.com.
> Visit this group at http://groups.google.com/group/ceylon-users.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/ceylon-users/b4a5fe5b-5f26-40e6-98f4-124370a40e0e%40googlegroups.com.

Tom Kaitchuck

unread,
Jun 8, 2015, 2:08:55 PM6/8/15
to ceylon...@googlegroups.com

On Sunday, June 7, 2015 at 2:57:18 AM UTC-7, Gavin King wrote:
I would like to point out that the kind of analysis you're talking
about here is something that you guys could implement as:

- a couple of annotations written in Ceylon, and
- a Visitor for the typechecker's AST.

AFAICT, this kind of stuff doesn't actually require very much in the
way of impact on the existing codebase, so if it's something you care
about, it's something *you* could contribute.

Glad to hear it! I suspected as much for mutates, and blocking. It also seems like `!` could turned into a parameter annotation like maybe 'immutable' the same goes for class declarations (and perhaps ! or some other symbol could be sugar for the annotation). But what about for generics? My understanding is that an annotation like this:
class Foo<Immutable T> () ... 
Is not supported. Is there a good way to deal with that case? 

Gavin King

unread,
Jun 8, 2015, 2:59:57 PM6/8/15
to ceylon...@googlegroups.com
On Mon, Jun 8, 2015 at 8:08 PM, Tom Kaitchuck <tom.ka...@gmail.com> wrote:
>
> But what about for
> generics? My understanding is that an annotation like this:
> class Foo<Immutable T> () ...
> Is not supported. Is there a good way to deal with that case?

To me it seems more like a type constructor than like an annotation. i.e.

Foo<Immutable<T>>

I mean, if it's just an annotation, then it wouldn't be propagated by
type inference, and wouldn't affect assignment rules. that can't be
right.

Tom Kaitchuck

unread,
Jun 8, 2015, 5:33:57 PM6/8/15
to ceylon...@googlegroups.com
On Monday, June 8, 2015 at 11:59:57 AM UTC-7, Gavin King wrote:
On Mon, Jun 8, 2015 at 8:08 PM, Tom Kaitchuck <tom.ka...@gmail.com> wrote:
>
>  But what about for
> generics? My understanding is that an annotation like this:
> class Foo<Immutable T> () ...
> Is not supported. Is there a good way to deal with that case?

To me it seems more like a type constructor than like an annotation. i.e.

    Foo<Immutable<T>>

I mean, if it's just an annotation, then it wouldn't be propagated by
type inference, and wouldn't affect assignment rules. that can't be
right.

Perhaps it should be a type. However what does this do to the caller? If they call foo(Immutable(t)); and get the results out by doing Immutable<T> t ..;  T t1 = t.get();
Then there is nothing to prevent the Map from doing the same thing and invoking methods on T. Additionally this imposes a runtime overhead.


Alternately if there were something a bit more magic like:
MyList<T>() given T satisfies Opaque { ...
}
Where 'Opaque' is a synonym for Anything but is not allowed to be combined with any other constraints. (IE: one cannot say `satisfies Opaque & Comparable`) This would solve the problem. 


Reply all
Reply to author
Forward
0 new messages