concurrency issues


Ian Clarke

Oct 16, 2009, 6:14:17 PM
to swarm-...@googlegroups.com
I've been thinking about the problems we'll need to deal with if/when
the load balancer (not sure that "load balancer" is an accurate term)
needs to move data between machines.

What if I do something like:

var dog = dogRef()  // dereference dogRef to get the dog it points to
...stuff...
...more stuff...
// at this point, the load balancer decides to move the object
// dogRef points to to a different computer
...more stuff...
dog.makeDead()  // we call a method on dog that mutates the dog, i.e. we kill the dog

BUT - the actual dog pointed to by dogRef() is no longer the same
object that was placed in the "dog" variable! We've mutated the wrong
object!

Does this make sense?

Ian.

--
Ian Clarke
CEO, Uprizer Labs
Email: i...@uprizer.com
Ph: +1 512 422 3588
Fax: +1 512 276 6674

Bayani Portier

Oct 16, 2009, 9:46:22 PM
to swarm-...@googlegroups.com
Are you talking about something along the lines of tombstoning in Lotus Domino?

Ian Clarke

Oct 16, 2009, 9:52:06 PM
to swarm-...@googlegroups.com
I'm having trouble finding a good reference for that, could you
elaborate on what "Tombstoning" is?

Ian.

Bayani Portier

Oct 16, 2009, 10:00:12 PM
to swarm-...@googlegroups.com
The basic principle behind it is that when you kill an object, or (in our case) move an object away from a server, you always leave behind either a tombstone object (indicating that it's dead) or a forwarding reference to the new location you are referring off to.
 
So for instance:
 
var dog = dogRef()  // dereference dogRef to get the dog it points to (@a1457283)

...stuff...
...more stuff...
// at this point, the load balancer decides to move the object dogRef
// points to to a different computer: (@b14233123) -> server object that
// will hold the new data

...more stuff...
dog.makeDead()  // we call a method on dog that mutates the dog, i.e. we
                // kill the dog: (@b14233123) kills the dog on the remote server

BUT - the actual dog pointed to by dogRef() is no longer the same
object that was placed in the "dog" variable! We've mutated the wrong
object! Not so here: (@a1457283) points to (@b14233123), which points to a tombstone, indicating that it's dead.
 
Does this make sense?
 
There would need to be quite a bit more machinery present to maintain information for GC.
 
-b
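[Editor's note] A minimal Scala sketch of the tombstone/forwarding idea Bayani describes. All names here (RefState, Local, Forwarded, Dead, TombstoneException) are invented for illustration; nothing like this exists in Swarm:

```scala
class TombstoneException extends RuntimeException("object has been destroyed")

// A Ref's target is either local, forwarded to another node, or dead.
sealed trait RefState[T]
case class Local[T](value: T) extends RefState[T]           // object lives on this node
case class Forwarded[T](target: Ref[T]) extends RefState[T] // object moved elsewhere
case class Dead[T]() extends RefState[T]                    // the tombstone itself

class Ref[T](initial: RefState[T]) {
  private var state: RefState[T] = initial
  // Dereference: follow forwarding pointers until we reach the live
  // object, or hit a tombstone and fail loudly.
  def deref: T = state match {
    case Local(v)        => v
    case Forwarded(next) => next.deref
    case Dead()          => throw new TombstoneException
  }
  def moveTo(target: Ref[T]): Unit = { state = Forwarded(target) } // leave a forward behind
  def kill(): Unit = { state = Dead() }                            // leave a tombstone behind
}
```

Under this sketch, Ian's dog.makeDead() would travel through the forward left at (@a1457283) to the real dog at (@b14233123), rather than mutating a stale local copy.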

Rick R

Oct 16, 2009, 10:46:31 PM
to swarm-...@googlegroups.com
I think this is exactly why the Darkstar guys make such a strong distinction between reads and writes on copies (caches) of an object.  They also recognize a central authority on the objects themselves. I'm not sure if we're willing to do either in Swarm.

What we are addressing is concurrency control. I can think of two choices offhand:

1. Reader/writer locks. This implies a quasi-centralized authority on a domain of objects, but it's known to be scalable (e.g. BigTable).
2. MVCC plus consensus. These are more flexible and can be decentralized, but don't scale well to a lot of participants.

Rick R

Oct 16, 2009, 11:04:14 PM
to swarm-...@googlegroups.com
Another option. I'm not entirely sure that this is possible, but it seems to follow the swarm principles.
We make it mandatory that all objects must be referenced through Refs, so they're no longer Refs, more like Bastions. So if you attempt to mutate the underlying object, the Ref checks the location of the object, and if it is on another machine, it moveTo's the correct machine. 

I like this idea, but it is not without flaws. Primarily, we're just delaying solving the concurrency problem :)  
In addition, it adds a lot of weight to interacting with objects.

Thoughts?

Patrick Wright

Oct 17, 2009, 3:42:25 AM
to swarm-...@googlegroups.com
It seems like there are lots of approaches. One is a sort of
node-specific locking system, where you acquire a data item for either
read or write, e.g. within a transaction, before you start work, and
while you have that lock it is not allowed to move. Similarly, if the
process responsible for moving data marks a node for a move operation,
it can no longer participate in a transaction on the node where it's
currently located.

Bayani's notion of a tombstone reference could be used passively, e.g.
you get an exception if a node you try to access within your process
has been moved and the reference is now a tombstone.

Just thinking out loud, it seems we would want to reduce the chance of
failing transactions due to data moves. One way to do this is to run
the operations on the data on a different time cycle, so to speak,
than the data-reorg process. So a reorg which attempts to co-locate
data, happens on a very slow, leisurely schedule. The operations
against the data operate at full speed. That might reduce the
contention between the two; an additional advantage is that moving
data will likely be (relatively) slow and expensive, and you may want
to defer this to off-peak times.

It seems there has to be some sort of prioritization, where the
request to write and the request to move are both balanced against
each other. Part of the prioritization would include elapsed time,
e.g. a request to move gets greater priority the longer it remains
unfulfilled. At some point, the node says, "sorry, guys, I know you
need to work on this data, but it's needed elsewhere as well." and
moves it.

Patrick Wright

Oct 17, 2009, 4:02:21 AM
to swarm-...@googlegroups.com
> I think this is exactly why the Darkstar guys make such a strong distinction
> between reads and writes on copies (caches) of an object.  They also
> recognize a central authority on the objects themselves. I'm not sure if
> we're willing to do either in Swarm.

Just a thought, I think their approach has advantages in a distributed
system. You have to address "the fallacies of distributed computing".
Nodes can crash--either the program, or the hardware-- admins can muck
up the network config making nodes unreachable, you can have failure
at any stage while moving data between nodes, etc. If you decentralize
everything, you need ways to recover when things go wonky. Central
authorities can help track the state of the overall system, and let
you know what processes were incomplete so you can restart them, for
example.

Bayani Portier

Oct 17, 2009, 6:42:50 AM
to swarm-...@googlegroups.com
The read/write distinction in DarkStar makes absolute sense to me. I've found in all of the systems that I've worked on that you rarely get cyclical read/write conditions. Data tends to funnel through a series of transactions, and tends to expand and contract along the way.
 
The trick would be to statistically identify where the data contracts, and identify them as "gate points". When you hit a gate point, the algorithm can then easily package up the state data along with the process data, and flick it to the next data shard.
 
The idea would then be to have all of the data between gate points present on one server. This would be in two parts: read data and write data.
 
The read data is fine, as you can slave off copies (there is a reason why I'm using the slave/master model). It is the write data that is tricky. You would want as much, if not all, of the write data locally within a shard. This may mean that several gate points would need to be on the same machine. You can only have one master write copy of the data, with as many read slave segments as you like over many server shards.
 
At least this is what I've had in my head for a while now as to how I suspect a system would want the ideal solution to look.
 
-b

Rick R

Oct 17, 2009, 11:11:47 AM
to swarm-...@googlegroups.com
I guess we should make a distinction between the two problems we're solving here:

1. Multi Version Concurrency Control
2. Ergonomics

I'm hoping it's possible to solve #2 and create the appropriate interfaces with which we can later experiment with solutions for #1.

I agree that there are a lot of approaches. I think the Swarm methodology, if I may be so bold as to assume it, is to make the distributed system as transparent as possible. So whatever we come up with, the net effect is that it needs to be no different than a completely unencumbered local application.

The best I can come up with would be making every object extend SwarmObject, which provides a get and put method (or overloads apply, whatever). The parent SwarmObject would be responsible for managing the location of the execution of said object. In addition, the get and put methods would talk to the appropriate nodes and get the appropriate locks for reading and writing.

Sadly (IMO) Scala doesn't seem to support the notion of dynamic methods like Ruby or Python would, so we will, at the very least, have to call
foo.get("blah") and foo.put("blah", newvalue)
instead of
val a = foo.blah and foo.blah = newvalue.
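[Editor's note] A minimal sketch of what that get/put surface could look like. The in-memory field map is a placeholder; in a real system get and put would contact the owning node and take the appropriate locks, as Rick describes:

```scala
// Hypothetical SwarmObject sketch; the mutable.Map stands in for state
// that would really live on (and be locked by) the owning node.
abstract class SwarmObject {
  private val fields = scala.collection.mutable.Map[String, Any]()
  def get[A](name: String): A =                 // would acquire a read lock
    fields(name).asInstanceOf[A]
  def put(name: String, value: Any): Unit = {   // would acquire a write lock
    fields(name) = value
  }
}

class Foo extends SwarmObject
```

Usage would then be exactly the shape Rick laments: foo.put("blah", 42) and foo.get[Int]("blah") rather than foo.blah.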

Any ideas?

Ian Clarke

Oct 17, 2009, 11:18:12 AM
to swarm-...@googlegroups.com
I think I'm a bit confused, perhaps because the "mutation" I happened
to choose was "makeDead()", but this was arbitrary - it could have
been dog.incrementAge().

How does dog.makeDead() affect the dog on the remote server?

It seems like we need to change what the variable "dog" is pointing to
on the fly, but I'm not sure that is possible with the JVM...

I guess I'm being slow today :-)

Ian.

Patrick Wright

Oct 17, 2009, 11:36:56 AM
to swarm-...@googlegroups.com
> The best I can come up with would be making every object extend SwarmObject.
> Which provides a get and put method (or overloads apply, whatever).  The
> parent SwarmObject would be responsible for managing the location of the
> execution of said object. In addition, the get and put methods would talk to
> the appropriate nodes and get the appropriate locks for reading and writing.
>
> Sadly (IMO) Scala doesn't seem to support the notion of dynamic methods like
> ruby or python would, so we will, at the very least have to call
> foo.get("blah") and foo.put("blah", newvalue)
> instead of
> val a = foo.blah and  foo.blah = newvalue.

First, I don't think you can erase the distinction between local and
distributed computing. [1]

That aside, I think there's a pretty fundamental problem with the
current Ref design, which I pointed out in another thread. It's fine
if the continuation can stop on reaching an unapply for a Ref, and be
pushed over to another node where that Ref resides. But as soon as we
access the Ref in its local space (the node where it resides) and
assign it explicitly or implicitly to a local variable in the
continuation, we are going to carry it with us if and when we move to
other nodes. The more nodes we access, the more data we potentially
carry with us. That's expensive, potentially too expensive if we
happen to hold on to the root of a large object graph. I think we have
to somehow make an explicit distinction between data which is local to
the continuation and data which we can query, but will not actually
want to hold on to.

I think that in some ideal form, the idea of the wandering
continuation fits best to some accumulator model. That is, the purpose
and goal of the continuation is to visit nodes and collect data (say,
statistics) as it visits, without actually retrieving data in the
normal case. At most, it may retrieve a set of "keys", which it
delivers somewhere before it exits.

One approach I've thought of is that a Ref would not actually have a
normal unapply to access its data. Rather, you can either get the key
to the Ref (its UUID, so to speak), or you can ask the Ref to perform
some operation for you, using a closure. While we can't prevent you
from accidentally holding on to some remote data within the closure,
we could enforce some constraints, via an (as of now imaginary)
compiler plugin:
- from within a closure executed by a Ref, one is provided (via a
parameter) access to the data pointed to by the Ref, but is not
allowed to reference Refs directly within the block
- the return from the closure must not be a Ref, nor contain a
reference to a Ref

something like that. People may still make mistakes, but maybe we can
keep them away from the most egregious ones.
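[Editor's note] A sketch of the closure-only Ref Patrick proposes. The names are invented, and the "no Refs in or out of the closure" rule appears only as a comment, since the compiler plugin is imaginary:

```scala
import java.util.UUID

case class Dog(name: String, age: Int)

// A Ref with no plain dereference: you can read its key, or hand it a
// closure to run against the data where it lives. By convention (one
// day enforced by a compiler plugin), the closure must neither capture
// nor return a Ref - only plain local data comes back.
class Ref[T](private val data: T) {
  val key: UUID = UUID.randomUUID()
  def query[R](f: T => R): R = f(data)
}
```

So instead of pulling the Dog into the continuation, you'd write dogRef.query(_.age): the Int travels, the Dog stays put.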

What I'm a little more interested in right now is the issue of the
conceptual model of programming within Swarm, rather than the issues
of concurrency control or balancing data across nodes. I think what
Ian has demo'ed so far is exciting, but needs a more solid conceptual
model before I'd find it useful.

Patrick

1: "A Note on Distributed Computing",
http://research.sun.com/techrep/1994/abstract-29.html

Rick R

Oct 17, 2009, 3:17:23 PM
to swarm-...@googlegroups.com

First, I don't think you can erase the distinction between local and
distributed computing. [1]

 
Change the "can" to a "should" and I would probably agree with you there :)
I just figured that the point of Swarm was to assume that all of your computations should be distributed.
 
That aside, I think there's a pretty fundamental problem with the
current Ref design, which I pointed out in another thread. It's fine
if the continuation can stop on reaching an unapply for a Ref, and be
pushed over to another node where that Ref resides. But as soon as we
access the Ref in its local space (the node where it resides) and
assign it explicitly or implicitly to a local variable in the
continuation, we are going to carry it with us if and when we move to
other nodes. The more nodes we access, the more data we potentially
carry with us.

That's true, and it's a rather interesting side effect. It's like "Katamari Damacy" in a way :).
For that reason, as well as a myriad of others, things would be much easier if this were a pure functional system. No mutable state means no locking or access control required.

Perhaps it would work as a partially functional system like Erlang, in which the workers can read and copy state, but they can't modify it directly. They have to ask an external database (like mnesia), which does offer its own access control.
 
One approach I've thought of is that a Ref would not actually have a
normal unapply to access its data. Rather, you can either get the key
to the Ref (its UUID, so to speak), or you can ask the Ref to perform
some operation for you, using a closure.

I like this idea, and Scala makes this syntactically very easy. It also matches the model of transaction management in modern languages. We could offer an "update" (set a value), an "updateWith" (retrieve the value and use it to compute the replacement value), an "updateIf", etc. We could perhaps take steps to ensure that "get" returns a copy of the value, not the value itself.
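[Editor's note] Those operations might sketch out like this. Cell is an invented stand-in for whatever holds the master copy of a value, and synchronized is a placeholder for real distributed locking:

```scala
// Sketch of the update/updateWith/updateIf surface Rick names above.
class Cell[T](private var value: T) {
  def get: T = synchronized { value }                  // ideally returns a copy
  def update(v: T): Unit = synchronized { value = v }  // set a value
  def updateWith(f: T => T): Unit =                    // replace via a function
    synchronized { value = f(value) }
  def updateIf(p: T => Boolean)(f: T => T): Unit =     // conditional replace
    synchronized { if (p(value)) value = f(value) }
}
```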
 
...

What I'm a little more interested in right now is the issue of the
conceptual model of programming within Swarm, rather than the issues
of concurrency control or balancing data across nodes. I think what
Ian has demo'ed so far is exciting, but needs a more solid conceptual
model before I'd find it useful.

 
I agree completely. Once we decide on usage/interfaces, we can muck all we want to with different implementations. I think scala, with remotable continuations, offers the ability to give Swarm a much more natural, usable interface than other distributed systems.   I think it will take some experimentation to determine exactly what that is.


Patrick Wright

Oct 17, 2009, 3:51:52 PM
to swarm-...@googlegroups.com
>> First, I don't think you can erase the distinction between local and
>> distributed computing. [1]
>
> Change the "can" to a "should" and I would probably agree with you there :)
> I just figured that the point of Swarm was to assume that all of your
> computations should be distributed.

I think I replied to the wrong part of your earlier message, and maybe
I'm misunderstanding your point there. What I was replying to was "the
swarm methodology ... is to make the distributed system as transparent
as possible. So whatever we come up with, the net effect is that it
needs to be no different than a completely unencumbered local
application." It's that point of view that the article I pointed to
disagrees with.

I don't think it's possible to treat local calls and remote calls
the same, and I don't think it's useful to aim for it, except in the
sense that, sure, we can hide the details of data and computation
transfer from the end developer. It can be a comfortable API to use.
And sure, central to the proposal for Swarm is distributed
computation. But we can't (and shouldn't) obscure the underlying
difficulties.

I want to communicate the right tone here: yes, distributed, yes, easy
to use, yes, scalable. Sure. But we will have errors somewhere along a
continuation's travel between nodes: the next node is unreachable; the
transfer breaks halfway through; the transfer succeeds but the
connection is lost before the client realizes this; the next node isn't
accepting more requests; there's a serialization error (because the
target node doesn't have the correct version of the classfiles); the
target node starts working, then crashes. And so on. Those are just
realities. I see them all the time where I work, despite our best
efforts and intentions.

So the challenge as I see it is how to make Swarm relatively easy to
use, while still letting Swarm users build reliable distributed
systems.


> For that reason, as well as a myriad of others, things would be much easier
> if this were a pure functional system. No mutable state means no locking or
> access control required.
>
> Perhaps it would work as partially functional system like Erlang, in which
> the workers can read and copy state, but they can't modify it directly. They
> have to ask an external database (like mnesia) which does offer its own
> access control.

This is just a riff--but makes me think of XSL, in some way. A
computation visits the nodes of a tree that it's interested in, but
its goal is not to modify the tree, but to produce a tree of its own
as output.

Maybe this would be one type of computation in Swarm, a sort of pure
visitor which simply could not have side-effects on the system it was
visiting. Updates would happen through a different API. It somewhat
inverts the Actor model (we aren't sending data, we are sending
computations), but keeps immutability as a core principle.

Rick R

Oct 17, 2009, 5:52:09 PM
to swarm-...@googlegroups.com

This is just a riff--but makes me think of XSL, in some way. A
computation visits the nodes of a tree that it's interested in, but
its goal is not to modify the tree, but to produce a tree of its own
as output.


I haven't used XSL, but I'm told it has been proven that it's basically a pure functional language.  So yes.
Haskell could probably provide some more useful, general purpose examples. But there is a reason why non-mutable state is the core of the Erlang system, and why Haskell lends itself so well to concurrency.

Non-mutable state effectively removes *all* concurrency issues. Obviously, though, such a system isn't the most intuitive to the average programmer.

 
Maybe this would be one type of computation in Swarm, a sort of pure
visitor which simply could not have side-effects on the system it was
visiting. Updates would happen through a different API. It somewhat
inverts the Actor model (we aren't sending data, we are sending
computations), but keeps immutability as a core principle.


To point to Haskell again, there is a general pattern forming when dealing with IO and other impure systems. They write as much of the system as they can in the form of pure functions, then pipe data through those functions inside of impure (IO) monads. 

I think if we put a restriction on the continuation that it had to be pure, and made all data modifications occur via messages to and from our data stores (whether centralized or distributed), we would have a great interface that made the appropriate abstractions. In particular, we could treat the continuation as a local function, because its location really wouldn't matter (to the user, not to the system). Then when users receive the results of their computation, they could store them in the data store, which would handle the appropriate locks.
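[Editor's note] A rough sketch of that split, with an invented in-memory DataStore standing in for mnesia-style storage: the continuation stays pure and only emits messages, and the store is the single place where state mutates and locking lives:

```scala
sealed trait StoreMsg
case class Put(key: String, value: Int) extends StoreMsg

// The store is the only impure component; it can apply whatever
// locking it likes internally (here, a trivial synchronized block).
class DataStore {
  private val data = scala.collection.mutable.Map[String, Int]()
  def send(msg: StoreMsg): Unit = synchronized {
    msg match { case Put(k, v) => data(k) = v }
  }
  def read(key: String): Option[Int] = synchronized { data.get(key) }
}

// The continuation is a pure function: a snapshot in, messages out,
// no side effects - so where it runs doesn't matter to the user.
def pureStep(snapshot: Map[String, Int]): List[StoreMsg] =
  snapshot.toList.map { case (k, v) => Put(k, v + 1) }
```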

Just a thought...
 

Ian Clarke

Oct 18, 2009, 10:52:36 AM
to swarm-...@googlegroups.com
On Sat, Oct 17, 2009 at 4:52 PM, Rick R <rick.ri...@gmail.com> wrote:
> I think if we put a restriction on the continuation that it had to be pure.
> And we made all data modifications occur via messages to and from our data
> stores (whether centralized or distributed) We would have a great interface
> that made the appropriate abstractions.

Can you write some Scala code to demonstrate what this might look like
in practice (i.e. not the code to implement it, but some "user" code
that assumes it's already been implemented)?

I think this would give us a good sense of the "ergonomics" of the
approach you are suggesting.

Rick R

Oct 18, 2009, 12:44:40 PM
to swarm-...@googlegroups.com

I've got a couple hours today to devote to this, will try to get both the RemoteActor and the sample job nailed down.

Ian Clarke

Oct 18, 2009, 12:47:45 PM
to swarm-...@googlegroups.com
Rick,

You mentioned before that you were working on simulating various
candidate load balancing algorithms, did you ever make any progress on
this?

Ian.

Rick R

Oct 18, 2009, 12:53:20 PM
to swarm-...@googlegroups.com
Not at all. I started to build a simulation system into the existing Swarm, then realized that there were some fundamental design issues that needed to be nailed down. As I expressed before, I think the algorithms themselves are less important than the interface to them. Hopefully this week I will build a simple test harness.

Neil

Oct 20, 2009, 5:37:13 AM
to Swarm Discussion
Just to throw my two cents in here (and make my first post in the
process!)

On the tombstoning idea that was discussed, could we not simply throw
a tombstone exception if the dereferenced object has been relocated?

-Neil.

Ian Clarke

Oct 20, 2009, 8:08:30 AM
to swarm-...@googlegroups.com
On Tue, Oct 20, 2009 at 4:37 AM, Neil <sher...@gmail.com> wrote:
> Just to throw my two cents in here (and make my first post in the
> process!)
>
> On the tombstoning idea that was discussed, could we not simply throw
> a tombstone exception if the dereferenced object has been relocated?

Right, but how do we do this within the constraints of the JVM? How
do we make simply using a value in a variable throw an exception?

Rick R

Oct 20, 2009, 9:26:46 AM
to swarm-...@googlegroups.com
I think the issue is the more general case of "How do we handle multiple, simultaneous modifications to the same object from separate nodes?"

Ian used destroy as an example, and that's actually the easiest case, since changes don't need to be determined or synced, just discarded.

Here's a fun one:
Let's say you have a game where a player object has N lives. At certain events, the life count will be decremented.  When the count hits 0, the player object gets destroyed. 

Let's say the player object has 1 life, and there is a Ref to it on 2 nodes. Node A causes an event to occur which increases the player's life count. Concurrently, node B decrements the player's life count to 0 and then discards the object. Now we have to reconcile not only the state of the object, but the actual presence of the object.
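[Editor's note] One hedged way to at least detect that conflict is a version check on each write. Everything here (Player, applyDelta, the version field) is invented for illustration, and this only detects the conflict; how to resolve it is still policy:

```scala
// Each node remembers the version it read; a write applies only if the
// master's version hasn't moved underneath it. Reaching zero lives both
// destroys the player and bumps the version, so a late increment based
// on the stale read is rejected instead of resurrecting the object.
case class Player(lives: Int, version: Long, destroyed: Boolean = false)

def applyDelta(p: Player, readVersion: Long, delta: Int): Either[String, Player] =
  if (p.destroyed) Left("conflict: object was destroyed")
  else if (p.version != readVersion) Left("conflict: concurrent modification")
  else {
    val lives = p.lives + delta
    Right(Player(lives, p.version + 1, destroyed = lives <= 0))
  }
```

In Rick's scenario, node B's decrement lands first and destroys the player; node A's increment, carrying the old version, then comes back as a conflict rather than a silent lost update.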