Akka Clustering - Gossip Convergence and Leader Election Problem


Joo Lee

Dec 2, 2017, 5:05:02 AM
to Lagom Framework Users
Hello everyone,

I have a question regarding leader election / gossip convergence in Akka Clustering.

The Akka Cluster documentation says:

About Gossip Convergence:
Gossip convergence cannot occur while any nodes are unreachable. The nodes need to become reachable again, or moved to the down and removed states (see the Membership Lifecycle section below). This only blocks the leader from performing its cluster membership management and does not influence the application running on top of the cluster. For example this means that during a network partition it is not possible to add more nodes to the cluster. The nodes can join, but they will not be moved to the up state until the partition has healed or the unreachable nodes have been downed.


About Leader:
After gossip convergence a leader for the cluster can be determined. There is no leader election process, the leader can always be recognised deterministically by any node whenever there is gossip convergence. The leader is just a role, any node can be the leader and it can change between convergence rounds. The leader is simply the first node in sorted order that is able to take the leadership role, where the preferred member states for a leader are up and leaving (see the Membership Lifecycle section below for more information about member states). 
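The "first node in sorted order" rule from the quote above can be sketched in plain Scala. This is an illustrative model only, not Akka's actual code; the `Member` type and the string statuses here are simplifications I made up for the example:

```scala
// Illustrative model only -- NOT Akka's implementation.
// The leader is simply the first member, in a deterministic ordering
// (approximated here by sorting addresses), whose state allows
// leadership (Up or Leaving). No election protocol is involved.
object LeaderSketch {
  final case class Member(address: String, status: String)

  def leaderOf(members: Seq[Member]): Option[String] =
    members
      .filter(m => m.status == "Up" || m.status == "Leaving")
      .map(_.address)
      .sorted
      .headOption

  def main(args: Array[String]): Unit = {
    val members = Seq(
      Member("akka.tcp://PortfolioService@trading-portfolio-2:2551", "Up"),
      Member("akka.tcp://PortfolioService@trading-portfolio-0:2551", "Up"),
      Member("akka.tcp://PortfolioService@trading-portfolio-1:2551", "Up")
    )
    // The lowest address sorts first, so every node that holds this same
    // member list deterministically recognises trading-portfolio-0 as leader.
    println(leaderOf(members))
  }
}
```

Because every node applies the same pure function to the same (converged) membership state, they all agree on the leader without exchanging any election messages.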


Correct me if I am wrong, but my understanding is that the leader CANNOT be selected/changed if gossip convergence cannot happen. Therefore, if the cluster is experiencing a split-brain problem (one node being unreachable), gossip convergence is not happening, and therefore none of the nodes in the cluster should change its leader.

However, this was not the case for me when I just tested in our staging environment.


Test Environment:
After applying the clustering changes to our portfolio service, we deployed it onto three nodes. Let's call them Node 0, Node 1, and Node 2.

Node = {0,1,2}
Quorum = 2

Then, when we queried the Akka HTTP management endpoint for the cluster topology, we saw that all three nodes could see each other and all recognised Node 0 as their leader. So far so good.


Then, in order to test the split-brain situation, we added the firewall rules below on Node 0, the leader node, to isolate it from the rest of the world.

root@trading-portfolio-0:/stashaway# iptables -A OUTPUT -p tcp --dport 2551 -j DROP
root@trading-portfolio-0:/stashaway# iptables -A INPUT -p tcp --dport 2551 -j DROP



The expected outcome was:

Node 0:

Leader: still considers itself the leader
Unreachable: Node 1 and Node 2
Reachable: Node 0

Node 1:

Leader: still considers Node 0 the leader
Unreachable: Node 0
Reachable: Node 2

Node 2:

Leader: still considers Node 0 the leader
Unreachable: Node 0
Reachable: Node 1

Actual Outcome:

However, this is the outcome we got:


Node 0

{
    "selfNode": "akka.tcp://PortfolioService@trading-portfolio-0:2551",
    "leader": "akka.tcp://PortfolioService@trading-portfolio-0:2551",
    "oldest": "akka.tcp://PortfolioService@trading-portfolio-0:2551",
    "unreachable": [
        {
            "node": "akka.tcp://PortfolioService@trading-portfolio-1:2551",
            "observedBy": [
                "akka.tcp://PortfolioService@trading-portfolio-0:2551"
            ]
        },
        {
            "node": "akka.tcp://PortfolioService@trading-portfolio-2:2551",
            "observedBy": [
                "akka.tcp://PortfolioService@trading-portfolio-0:2551"
            ]
        }
    ],
    "members": [
        {
            "node": "akka.tcp://PortfolioService@trading-portfolio-0:2551",
            "nodeUid": "1583081060",
            "status": "Up",
            "roles": []
        },
        {
            "node": "akka.tcp://PortfolioService@trading-portfolio-1:2551",
            "nodeUid": "1689105132",
            "status": "Up",
            "roles": []
        },
        {
            "node": "akka.tcp://PortfolioService@trading-portfolio-2:2551",
            "nodeUid": "-763138964",
            "status": "Up",
            "roles": []
        }
    ]
}


Node 1


{
    "selfNode": "akka.tcp://PortfolioService@trading-portfolio-1:2551",
    "leader": "akka.tcp://PortfolioService@trading-portfolio-1:2551",
    "oldest": "akka.tcp://PortfolioService@trading-portfolio-0:2551",
    "unreachable": [
        {
            "node": "akka.tcp://PortfolioService@trading-portfolio-0:2551",
            "observedBy": [
                "akka.tcp://PortfolioService@trading-portfolio-1:2551",
                "akka.tcp://PortfolioService@trading-portfolio-2:2551"
            ]
        }
    ],
    "members": [
        {
            "node": "akka.tcp://PortfolioService@trading-portfolio-0:2551",
            "nodeUid": "1583081060",
            "status": "Up",
            "roles": []
        },
        {
            "node": "akka.tcp://PortfolioService@trading-portfolio-1:2551",
            "nodeUid": "1689105132",
            "status": "Up",
            "roles": []
        },
        {
            "node": "akka.tcp://PortfolioService@trading-portfolio-2:2551",
            "nodeUid": "-763138964",
            "status": "Up",
            "roles": []
        }
    ]
}


Node 2
{
    "selfNode": "akka.tcp://PortfolioService@trading-portfolio-2:2551",
    "leader": "akka.tcp://PortfolioService@trading-portfolio-1:2551",
    "oldest": "akka.tcp://PortfolioService@trading-portfolio-0:2551",
    "unreachable": [
        {
            "node": "akka.tcp://PortfolioService@trading-portfolio-0:2551",
            "observedBy": [
                "akka.tcp://PortfolioService@trading-portfolio-1:2551",
                "akka.tcp://PortfolioService@trading-portfolio-2:2551"
            ]
        }
    ],
    "members": [
        {
            "node": "akka.tcp://PortfolioService@trading-portfolio-0:2551",
            "nodeUid": "1583081060",
            "status": "Up",
            "roles": []
        },
        {
            "node": "akka.tcp://PortfolioService@trading-portfolio-1:2551",
            "nodeUid": "1689105132",
            "status": "Up",
            "roles": []
        },
        {
            "node": "akka.tcp://PortfolioService@trading-portfolio-2:2551",
            "nodeUid": "-763138964",
            "status": "Up",
            "roles": []
        }
    ]
}

Question:
Clearly, gossip convergence occurred between Node 1 and Node 2 themselves, and they decided that Node 1 was their new leader!

Why is this happening? The Akka HTTP management query result clearly shows that Node 1 and Node 2 are not able to reach Node 0. And the Akka documentation states that "Gossip convergence cannot occur while any nodes are unreachable".

It also says "After gossip convergence a leader for the cluster can be determined. There is no leader election process, the leader can always be recognised deterministically by any node whenever there is gossip convergence".

Why and how did Node 1 and Node 2 decide that Node 1 is their new leader? Why did they reach gossip convergence without waiting for Node 0 to either be removed or join them back?


Thanks,

Joo

Patrik Nordwall

Dec 3, 2017, 4:28:57 AM
to Lagom Framework Users
On Sat, Dec 2, 2017 at 11:05 AM, Joo Lee <joo.an...@gmail.com> wrote:
Hello everyone,

I have a question regarding leader election / gossip convergence in Akka Clustering.

The Akka Cluster documentation says:

About Gossip Convergence:
Gossip convergence cannot occur while any nodes are unreachable. The nodes need to become reachable again, or moved to the down and removed states (see the Membership Lifecycle section below). This only blocks the leader from performing its cluster membership management and does not influence the application running on top of the cluster. For example this means that during a network partition it is not possible to add more nodes to the cluster. The nodes can join, but they will not be moved to the up state until the partition has healed or the unreachable nodes have been downed.


About Leader:
After gossip convergence a leader for the cluster can be determined. There is no leader election process, the leader can always be recognised deterministically by any node whenever there is gossip convergence. The leader is just a role, any node can be the leader and it can change between convergence rounds. The leader is simply the first node in sorted order that is able to take the leadership role, where the preferred member states for a leader are up and leaving (see the Membership Lifecycle section below for more information about member states). 


Correct me if I am wrong, but my understanding is that the leader CANNOT be selected/changed if gossip convergence cannot happen. Therefore, if the cluster is experiencing a split-brain problem (one node being unreachable), gossip convergence is not happening, and therefore none of the nodes in the cluster should change its leader.

There can be multiple leaders, one on each side of a network partition. Those will not perform the mentioned management tasks until there is convergence, i.e. the network partition has been healed or the unreachable nodes have been removed. Then these tasks will only be performed by one leader. It's also worth noting that the cluster membership state is like a Conflict-free Replicated Data Type (CRDT), so even if it were modified by more than one node it would eventually converge to the same state everywhere.
As mentioned above, there can be more than one leader, and that is what is shown in this information. It says nothing about convergence, but since there are unreachable nodes there is no convergence.
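One way to picture this is a plain-Scala sketch (illustrative only, not Akka's internals; the helper names are made up): each side of the partition determines its own leader as the first sorted member that it does not itself consider unreachable, while the members list stays identical on both sides.

```scala
// Illustrative sketch only -- not Akka's implementation. During a partition
// each side determines its own leader: the first member in sorted order
// that the side does NOT consider unreachable. The members list itself is
// the same on both sides (and converges like a CRDT).
object PartitionLeaderSketch {
  val members: List[String] = List(
    "akka.tcp://PortfolioService@trading-portfolio-0:2551",
    "akka.tcp://PortfolioService@trading-portfolio-1:2551",
    "akka.tcp://PortfolioService@trading-portfolio-2:2551"
  ).sorted

  def leaderSeenBy(unreachable: Set[String]): Option[String] =
    members.filterNot(unreachable.contains).headOption

  def main(args: Array[String]): Unit = {
    // Node 0's side of the partition: it sees nodes 1 and 2 as
    // unreachable, so it still considers itself (node 0) the leader.
    println(leaderSeenBy(Set(members(1), members(2))))
    // Node 1 / Node 2's side: node 0 is unreachable, so the first
    // remaining member -- node 1 -- is their leader.
    println(leaderSeenBy(Set(members(0))))
  }
}
```

This reproduces exactly what the management endpoint showed: Node 0 reports itself as leader, while Nodes 1 and 2 both report Node 1, even though all three agree on the same three-element members list.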
 

--
You received this message because you are subscribed to the Google Groups "Lagom Framework Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lagom-framework+unsubscribe@googlegroups.com.
To post to this group, send email to lagom-framework@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lagom-framework/7e8760f5-6bab-4c5b-a362-0749365de6fe%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--

Patrik Nordwall
Akka Tech Lead
Lightbend - Reactive apps on the JVM
Twitter: @patriknw

Joo Lee

Dec 4, 2017, 9:23:24 AM
to Lagom Framework Users
Thanks a lot, Patrik. It makes a lot of sense.
Do you think I am overreacting a bit if I say the documentation could be improved by being more explicit about the fact that there can be multiple leaders? It feels slightly misleading when it says "Gossip convergence cannot occur while any nodes are unreachable" and then "After gossip convergence a leader for the cluster can be determined", which led me to think that all nodes being reachable is a requirement for the leader selection process.