In these kinds of discussions the word "atomic" has several different
meanings, and it's important to keep them clear. When you say that
reading a bool variable is atomic, you probably mean that the memory
load cannot be mixed with any other memory read or write. That is
true.  But there is another important question, which is when a
memory write M done by processor A can be seen by processor B, and
what that implies, if anything, about when processor B will see other
writes that processor A did before or after M.  In the code above, it
is possible that B will see the write to "done" before it sees the
write to "a", even though A does them in the other order.  You need to use
atomic instructions to avoid seeing the unexpected write ordering.
The runtime support for channels has a comment explaining why the
memory ordering doesn't matter in that particular case.
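
To make the ordering point concrete, here is a minimal sketch of one way
to publish "a" safely through an atomic "done" flag.  It assumes Go 1.19
or later for atomic.Bool, and the names message and publishAndRead are
made up for illustration; they are not from the code under discussion.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// message pairs an ordinary field with an atomic flag (hypothetical type).
type message struct {
	a    string
	done atomic.Bool
}

// publishAndRead writes a in one goroutine and reads it in another,
// using the atomic flag to order the two accesses.
func publishAndRead() string {
	var m message

	go func() {
		m.a = "hello, world" // plain write, done first
		m.done.Store(true)   // atomic store publishes it
	}()

	// The atomic load that observes true is synchronized after the
	// store, so the earlier write to m.a is visible once the loop exits.
	for !m.done.Load() {
		runtime.Gosched() // yield instead of spinning hard
	}
	return m.a
}

func main() {
	fmt.Println(publishAndRead())
}
```

If done were a plain bool, nothing would order the two writes as seen by
the reader, and the race detector would flag the program.  In real code a
channel or sync.WaitGroup is usually a better fit than a spin loop.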
Ian