atom replicates algo

Borislav Iordanov

Feb 18, 2009, 1:18:02 AM
to HyperGraphDB
Hi Cipri,

I get a deadlock in RememberTaskServer.handleAccept. The line:

if (getThisPeer().getLog().registerRequest(peerId, last_version,
current_version))

blocks because the registerRequest method does a
'current_version.wait()' but the notify is in 'finishRequest', which is
only called after the wait returns. In short, it's a deadlock, and I
don't see how it could ever have worked.

Also, the serialization of HG persistent handles in Structs.java was
not working and querying returned random results (because when you do
new UUIDPersistentHandle you get a new random UUID). I fixed and
committed that, but the above problem has to do with the depths of
your replication algorithm, so I need help :) All I'm trying to do
is change the value of an atom at a remote peer via
RemotePeer.replace(....).
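
To illustrate the handle problem: a minimal sketch of round-tripping a UUID through bytes, so that deserialization rebuilds the same identifier instead of generating a fresh random one. This uses only java.util.UUID and java.nio.ByteBuffer; the class name HandleBytesSketch is hypothetical, and this is not the code in Structs.java or UUIDPersistentHandle.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper; NOT the actual Structs.java serialization code.
class HandleBytesSketch {
    // Writes the 128-bit UUID as 16 bytes: high 64 bits first, then low 64 bits.
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    // Rebuilds the same UUID from its 16 bytes. Calling UUID.randomUUID()
    // here instead would produce an unrelated handle on every read.
    static UUID fromBytes(byte[] raw) {
        ByteBuffer buf = ByteBuffer.wrap(raw);
        long high = buf.getLong();
        long low = buf.getLong();
        return new UUID(high, low);
    }
}
```

With this, fromBytes(toBytes(id)) equals id, which is what a query on the receiving peer needs in order to find the atom.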

Best,
Boris

--
"Frozen brains tell no tales."

-- Buckethead

Ciprian Costa

Feb 19, 2009, 9:55:26 AM
to hyperg...@googlegroups.com
Hi Boris,

    I will have to see what the problem is. I am sure that it worked :), but I have not tested it in detail after the merge. I'll take some time over the weekend to try and solve it.

Regards,
Cipri

Ciprian Costa

Feb 21, 2009, 6:09:07 AM
to hyperg...@googlegroups.com
Hi Boris,

   The way it was designed is that there could be multiple RememberTaskServers running, and while one is waiting in handleAccept, another one should reach finishRequest and notify all that are waiting. The reason I did this is that we should not allow the order of changes to be altered when saving the replicated values. So if changes A and B were made in this order but B arrives before A, B will block until A arrives, finishes and signals B to proceed.

  I suspect that in your case, somehow A got lost and never arrived, so B cannot proceed. We should have some logic to detect this and maybe recover from such cases. I will look into the algorithm and see if I can think of a better way to implement it. Any thoughts?
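
For discussion, here is a minimal sketch of the ordering idea with a bounded wait, so that a lost predecessor is detected instead of blocking forever. It is not the actual peer log code: the class name OrderedLogSketch, the simplified two-argument registerRequest, and the timeout parameter are all assumptions made for illustration.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the peer log; NOT the HyperGraphDB implementation.
class OrderedLogSketch {
    private final Object lock = new Object();
    private long appliedVersion; // version of the last change applied locally

    OrderedLogSketch(long initialVersion) {
        this.appliedVersion = initialVersion;
    }

    // Waits until the change that produced lastVersion has been applied.
    // Returns true if the caller may apply its own change, false if the
    // predecessor did not arrive within timeoutMillis or the thread was interrupted.
    boolean registerRequest(long lastVersion, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (lock) {
            while (appliedVersion < lastVersion) {
                long remainingMillis =
                        TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMillis <= 0) {
                    return false; // predecessor presumed lost; caller decides how to recover
                }
                try {
                    lock.wait(remainingMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }

    // Records that the change producing currentVersion is applied and wakes all waiters.
    void finishRequest(long currentVersion) {
        synchronized (lock) {
            appliedVersion = Math.max(appliedVersion, currentVersion);
            lock.notifyAll();
        }
    }
}
```

In this sketch, B calls registerRequest with A's version and proceeds only on true. On false it could, for example, ask the sender to resend A; what the recovery should be is left open here.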

Regards,
Cipri

Borislav Iordanov

Feb 21, 2009, 11:41:08 AM
to hyperg...@googlegroups.com
Hi Cipri,

We probably need to get on chat to clarify some details because I
don't see what exactly could have happened - I don't understand how
the ordering works and what sort of state is maintained.

In my case, there was only one change event: a replace for an atom.
However, the replace wasn't working because of typing issues and
exceptions were being thrown, so I was debugging this and restarting
the peers over and over again. So I guess we definitely need a "peer
dropped" event implemented, and perhaps that will help solve your
problem?

Boris

PS I'm playing around with some changes to the framework that
shouldn't impact the replication algorithm. I think I'll just commit
when I'm done, and if it turns out I've done something terribly wrong,
it's always possible to revert/readjust :)

Borislav Iordanov

Feb 21, 2009, 3:33:58 PM
to hyperg...@googlegroups.com
In AbstractActivity, there are methods 'afterStateChanged' and
'stateChanged': what's the difference? Why two methods with seemingly
the exact same purpose?

Ciprian Costa

Feb 22, 2009, 6:27:14 AM
to hyperg...@googlegroups.com
Hi Boris,

   Maybe the naming is not very fortunate, but afterStateChanged is used to manage changes to the internal state of the activity, while stateChanged is called when the listeners should be informed that the state has changed. The idea is that you have two steps: changing the state, and letting the listeners know that the state has changed. afterStateChanged is part of the first step, while stateChanged is the second step (as I said, not a very fortunate naming).
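
The two steps could be sketched roughly as follows. Only the method names afterStateChanged and stateChanged come from AbstractActivity; the class TwoStepActivitySketch, the String state, the setState entry point and the trace list are hypothetical and are there only to make the order of calls visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration; NOT the real AbstractActivity.
class TwoStepActivitySketch {
    private final List<Consumer<String>> listeners = new ArrayList<>();
    private final List<String> trace = new ArrayList<>();
    private String state = "Started";

    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    void setState(String newState) {
        state = newState;
        afterStateChanged(newState); // step 1: internal bookkeeping
        stateChanged(newState);      // step 2: tell the listeners
    }

    // Step 1: the activity updates whatever depends on its own state.
    protected void afterStateChanged(String newState) {
        trace.add("afterStateChanged:" + newState);
    }

    // Step 2: external listeners are informed of the new state.
    protected void stateChanged(String newState) {
        trace.add("stateChanged:" + newState);
        for (Consumer<String> listener : listeners) {
            listener.accept(newState);
        }
    }

    String getState() {
        return state;
    }

    List<String> trace() {
        return trace;
    }
}
```

Calling setState("Completed") records afterStateChanged first and stateChanged second, and each registered listener sees "Completed" once.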

Regards,
Cipri

Ciprian Costa

Feb 22, 2009, 6:29:57 AM
to hyperg...@googlegroups.com
Hi Boris,

   Let me know when you have time to chat. The lock was surely caused by stopping and starting peers. I am thinking that instead of handling the peer-dropped scenario we could replace locking with versioning. In that case, all changes that have smaller versions than the current data could just be ignored.
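
A minimal sketch of what versioning instead of locking might look like, assuming each atom carries a single numeric version. The class VersionedStoreSketch, the String handle key, and the rule that an equal version is also ignored are assumptions; how versions would actually be assigned is not decided here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical store; NOT part of HyperGraphDB.
class VersionedStoreSketch {
    private static final class Versioned {
        final long version;
        final Object value;

        Versioned(long version, Object value) {
            this.version = version;
            this.value = value;
        }
    }

    private final Map<String, Versioned> atoms = new ConcurrentHashMap<>();

    // Applies the change only if it is newer than what is stored.
    // Returns true if applied, false if it was stale and ignored.
    // Assumption: a change with an equal version is treated as stale too.
    boolean replace(String handle, long version, Object value) {
        Versioned incoming = new Versioned(version, value);
        Versioned winner = atoms.merge(handle, incoming,
                (current, candidate) ->
                        candidate.version > current.version ? candidate : current);
        return winner == incoming;
    }

    Object get(String handle) {
        Versioned stored = atoms.get(handle);
        return stored == null ? null : stored.value;
    }
}
```

Nothing blocks in this version: if B (version 2) arrives before A (version 1), B is stored and A is simply dropped when it shows up later. That is safe for replace, where the newest value wins, but it would lose information for changes that build on each other.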

Regards,
Cipri

Borislav Iordanov

Feb 22, 2009, 12:10:57 PM
to hyperg...@googlegroups.com
On Sun, Feb 22, 2009 at 6:29 AM, Ciprian Costa <cipria...@gmail.com> wrote:
> Hi Boris,
>
> Let me know when you have time to chat. The lock was surely generated by
> stopping and starting peers. I am thinking that instead of having the peer
> dropped scenario we could replace locking with versioning. In that case, all
> changes that have smaller versions than the current data could just be
> ignored.

Ah, I like the idea! This is certainly the more logical thing to do.
Tracking peer presence is probably tricky and error-prone, but I still
believe that in general some form of it is needed.

Borislav Iordanov

Feb 22, 2009, 1:02:02 PM
to hyperg...@googlegroups.com
I think the listeners should be always called. Otherwise, the design
looks patchy and the rules of the framework are not clear.

In general, as I mentioned, I'm a bit uncomfortable with the flow of
conversations and tasks. The rules to follow when creating new ones
are not easy to figure out. There's this asymmetry b/w the peer
initiating an activity (task or conversation) and the other peers that
create an activity locally as a result of a received message which is
buried in the logic. The 'msg' member variable for tasks is used just
for that. Conversations are activities, but are never submitted to the
executor - their 'run' method does nothing. It is only possible to
start a conversation as a response to a task message, but why is that?
When should activities override the 'handleMessage' method and what to
do there is not clear - for example, one has to read the "framework"
code to learn that 'handleMessage' is not called when the first
message about an activity is received. As a result the 'startTask'
method is assumed to play a dual role: either initiate the task or
respond to the first message of a task initiated by another peer. And
the naming is also a bit confusing because 'starting a task' sounds
like 'initiating a task that nobody knows about yet'. When exactly are
state change listeners called, and when does a state change without
calling the listeners, and why, etc. Conversations have those
transitions based on performatives, but tasks have transitions based
on task state & conversation state, and you can have multiple
conversations per task, so how does that affect the state transitions
for the latter when you have conflicting transitions? So I'm having
a hard time eliminating all those confusing details while preserving the
basic framework & logic.

Best,
Boris