I’m a little confused by this thread…these are generic distributed
programming concepts and issues, applicable when a centralized piece
of code accesses remote objects…or distributed pieces of code access
remote objects. I don’t think they should be solved in any special way
for the swarm…if you need to enlist a bunch of local objects in a
distributed transaction monitor, so be it – no special “swarm” changes
needed…?
The point of the swarm is to get the code to move – there are in
essence no remote objects, since the code moves to them when it needs
them. It will not take the other pieces with it, except some locally
aggregated data (local vals/vars).
Critical section is a misnomer in this case:
http://en.wikipedia.org/wiki/Critical_section
simply because when the swarm is local to a piece of data, it will
contend with local code, not just with other swarms, so classic local
locks must be used.
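To illustrate what I mean by "classic locks" here — a minimal sketch, where the names (LocalData, update) are purely illustrative and not any actual swarm API: while the swarm is co-located with the data, it takes the very same local monitor as any plain local code touching that data:

```scala
// Hypothetical sketch: swarm code and ordinary local code contend on
// the same local lock while the swarm is co-located with the data.
object LocalData {
  private var balance = 0 // local data touched by swarms AND local threads

  // A classic critical section: whoever is local (swarm or not) must
  // synchronize on the same monitor.
  def update(delta: Int): Int = synchronized {
    balance += delta
    balance
  }
}
```

Nothing swarm-specific here — that is the point: locally, it is just ordinary locking.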
Do you mean this is trying to deal with a scenario like this: the code
reads local data on node 1 and moves on to node 2. Meanwhile, the data
on node 1 has changed and the image we carry is now stale, especially
if the code later moves back to read it again?
This is a transactionality/consistency question: all the resources
must concur when the code completes. There’s a bunch of pessimistic
and a bunch of optimistic implementations, all dealing with Einstein’s
universally proven lack of synchronization between distant observers.
However, when adapting these to the swarm, there are some issues:
1. while the code is co-located with the data, you would use local
means to ensure consistency of the data, i.e. synchronized() critical
sections or other locks, for the duration of the local access
2. while the code is away,
a. You can’t notify a piece of code of remote data changes – it’s not
a BPEL with asynchronous handlers…it’s just a piece of sequential
statements…maybe you could inject exceptions into it – hmm, maybe…but
that will mess up its stack trace (execution state)…
b. You do not know ahead of time what remote data you need (to acquire/
lock all resources)…you discover that as the code runs its course
through its if/else branches…
c. You can’t leave lingering remote locks, to pessimistically lock the
data…unless you start requiring and relying on other 3rd party
objects: the distributed lock managers.
Since data access is encapsulated in Ref(), we could simply track the
version of the data we read and, when the data has changed (the local
nodes will make sure the version changes), we’ll notice that upon re-
reading. Or, at the end, in a very 2PC manner, re-check all the
versions… just before a “commit”.
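Something along these lines — a sketch only, where VersionedRef, read, write and commitOk are hypothetical names standing in for whatever Ref() actually does, on the assumption that each local node bumps a version counter on every write:

```scala
// Hypothetical optimistic-concurrency sketch: track the version seen at
// read time, then re-check all versions just before "commit", 2PC-style.
class VersionedRef[T](private var value: T) {
  private var version = 0L

  // Read returns both the value and the version we saw.
  def read: (T, Long) = synchronized { (value, version) }

  // The local node bumps the version on every write.
  def write(v: T): Unit = synchronized { value = v; version += 1 }

  def versionNow: Long = synchronized { version }
}

// At the end, re-check every (ref, version-we-read) pair: if any ref
// moved on, our image was stale and the "commit" must be retried.
def commitOk(reads: Seq[(VersionedRef[_], Long)]): Boolean =
  reads.forall { case (ref, seen) => ref.versionNow == seen }
```

This sidesteps issue 2.b nicely: we don’t need to know the read set up front, we just accumulate (ref, version) pairs as the code runs its course.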
Also, about local locks: we probably need a “with” block – just like
the delimited continuations, we need to mark the places in the code
where we use local data:
with (ref) { // context is local and may place lock
… // context may move to different node
} // context has to move back to release the local lock
Although – in my point 2.c) I figured that locks should not linger as
the swarm/context moves, so maybe in the “with” section, the code gets
really mad if you try to move it? Does this make the with() {…} a
critical section in the sense that the swarm doesn’t move?
with (ref) { // context is local and may place lock
… // context may NOT move to different node – what does it do if it
needs to?
} // context releases the local lock and then MAY move
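A rough sketch of that second, stricter variant – assuming a hypothetical Context with a moveTo() that ships the code elsewhere (none of these names are the real swarm API): the with-block takes the local lock, pins the context, and “gets really mad” (throws) if a move is attempted inside:

```scala
import java.util.concurrent.locks.ReentrantLock

// Hypothetical sketch: with(ref){...} as a no-move critical section.
// moveTo() and the pinned flag are illustrative only.
class Context {
  private var pinned = false

  def moveTo(node: String): Unit =
    if (pinned)
      throw new IllegalStateException(s"cannot move to $node inside with()")
    // else: serialize local state and ship the code to `node` (not shown)

  def withRef[T](lock: ReentrantLock)(body: => T): T = {
    lock.lock()     // context is local and places the lock
    pinned = true   // the swarm may NOT move while inside
    try body
    finally {
      pinned = false
      lock.unlock() // lock released before the context may move again
    }
  }
}
```

The lock never lingers across a move, consistent with 2.c – the price is that a move request inside the block is an error rather than a suspend/resume.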
Cheers,
Razie