How often does Hazelcast look for new/old members?

jsjunk...@gmail.com

Feb 2, 2017, 2:34:43 PM
to Hazelcast
Hello,

I am trying to test split-brain scenarios using partition groups, and I noticed the following: once the network segmentation is fixed, it takes around 1-2 minutes for all members to detect that they can see each other again. Is there a network configuration parameter that tells Hazelcast how often to look for new or existing members? I am using TCP/IP discovery and partition groups:

    <properties>
        <property name="hazelcast.max.no.heartbeat.seconds">5</property>
        <property name="hazelcast.heartbeat.interval.seconds">1</property>
    </properties>
    <network>
        <port auto-increment="false" port-count="100">5701</port>
        <outbound-ports>
            <!--
            Allowed port range when connecting to other nodes.
            0 or * means use system provided port.
            -->
            <ports>0</ports>
        </outbound-ports>
        <join>
            <multicast enabled="false">
                <multicast-group>224.2.2.3</multicast-group>
                <multicast-port>54327</multicast-port>
            </multicast>
            <tcp-ip enabled="true">
                <member-list>
                    <member>172.16.100.22</member>
                    <member>172.16.100.23</member>
                    <member>172.16.100.24</member>
                    <member>172.16.100.25</member>
                </member-list>
            </tcp-ip>
            <aws enabled="false">
                <access-key>my-access-key</access-key>
                <secret-key>my-secret-key</secret-key>
                <!--optional, default is us-east-1 -->
                <region>us-west-1</region>
                <!--optional, default is ec2.amazonaws.com. If set, region shouldn't be set as it will override this property -->
                <host-header>ec2.amazonaws.com</host-header>
                <!-- optional, only instances belonging to this group will be discovered, default will try all running instances -->
                <security-group-name>hazelcast-sg</security-group-name>
                <tag-key>type</tag-key>
                <tag-value>hz-nodes</tag-value>
            </aws>
            <discovery-strategies>
            </discovery-strategies>
        </join>
        <interfaces enabled="false">
            <interface>10.10.1.*</interface>
        </interfaces>
        <ssl enabled="false"/>
        <socket-interceptor enabled="false"/>
        <symmetric-encryption enabled="false">
            <!--
               encryption algorithm such as
               DES/ECB/PKCS5Padding,
               PBEWithMD5AndDES,
               AES/CBC/PKCS5Padding,
               Blowfish,
               DESede
            -->
            <algorithm>PBEWithMD5AndDES</algorithm>
            <!-- salt value to use when generating the secret key -->
            <salt>thesalt</salt>
            <!-- pass phrase to use when generating the secret key -->
            <password>thepass</password>
            <!-- iteration count to use when generating the secret key -->
            <iteration-count>19</iteration-count>
        </symmetric-encryption>
    </network>
    <!--
    <partition-group enabled="false"/>
    -->
    <partition-group enabled="true" group-type="CUSTOM">
        <member-group>
            <interface>172.16.100.22</interface>
            <interface>172.16.100.23</interface>
        </member-group>
        <member-group>
            <interface>172.16.100.24</interface>
            <interface>172.16.100.25</interface>
        </member-group>
    </partition-group>

Network connectivity (the segmentation was simulated using iptables) was restored at 2017-02-02 11:16:30. This is the log from 172.16.100.22:


INFO  2017-02-02 11:13:08,629 [hz._hzInstance_1_dev.cached.thread-3] com.hazelcast.internal.cluster.ClusterService: [172.16.100.22]:5701 [dev] [3.7.5]

Members [2] {
    Member [172.16.100.22]:5701 - 7cf49a37-03e3-4d58-8490-a4c1a39e6c51 this
    Member [172.16.100.23]:5701 - 23b5d6af-8741-4dd4-9704-01a3debfeb98
}

INFO  2017-02-02 11:13:09,777 [hz._hzInstance_1_dev.migration] com.hazelcast.internal.partition.impl.MigrationManager: [172.16.100.22]:5701 [dev] [3.7.5] Re-partitioning cluster data... Migration queue size: 271
INFO  2017-02-02 11:13:11,419 [hz._hzInstance_1_dev.migration] com.hazelcast.internal.partition.impl.MigrationThread: [172.16.100.22]:5701 [dev] [3.7.5] All migration tasks have been completed, queues are empty.
INFO  2017-02-02 11:17:35,628 [hz._hzInstance_1_dev.cached.thread-4] com.hazelcast.nio.tcp.TcpIpConnectionManager: [172.16.100.22]:5701 [dev] [3.7.5] Established socket connection between /172.16.100.22:35682 and /172.16.100.25:5701
INFO  2017-02-02 11:17:35,639 [hz._hzInstance_1_dev.cached.thread-2] com.hazelcast.cluster.impl.TcpIpJoiner: [172.16.100.22]:5701 [dev] [3.7.5] [172.16.100.25]:5701 should merge to this node , because : node.getThisAddress().hashCode() < joinMessage.address.hashCode() , this node data member count: 2
INFO  2017-02-02 11:17:35,640 [hz._hzInstance_1_dev.cached.thread-4] com.hazelcast.nio.tcp.TcpIpConnectionManager: [172.16.100.22]:5701 [dev] [3.7.5] Established socket connection between /172.16.100.22:39616 and /172.16.100.24:5701
INFO  2017-02-02 11:17:35,793 [hz._hzInstance_1_dev.IO.thread-in-1] com.hazelcast.nio.tcp.TcpIpConnection: [172.16.100.22]:5701 [dev] [3.7.5] Connection[id=8, /172.16.100.22:39616->/172.16.100.24:5701, endpoint=[172.16.100.24]:5701, alive=false, type=MEMBER] closed. Reason: Connection closed by the other side
INFO  2017-02-02 11:17:35,796 [hz._hzInstance_1_dev.IO.thread-in-2] com.hazelcast.nio.tcp.TcpIpConnection: [172.16.100.22]:5701 [dev] [3.7.5] Connection[id=7, /172.16.100.22:35682->/172.16.100.25:5701, endpoint=[172.16.100.25]:5701, alive=false, type=MEMBER] closed. Reason: Connection closed by the other side
INFO  2017-02-02 11:17:35,806 [hz._hzInstance_1_dev.IO.thread-Acceptor] com.hazelcast.nio.tcp.SocketAcceptorThread: [172.16.100.22]:5701 [dev] [3.7.5] Accepting socket connection from /172.16.100.25:50497
INFO  2017-02-02 11:17:35,808 [hz._hzInstance_1_dev.cached.thread-2] com.hazelcast.nio.tcp.TcpIpConnectionManager: [172.16.100.22]:5701 [dev] [3.7.5] Established socket connection between /172.16.100.22:5701 and /172.16.100.25:50497
INFO  2017-02-02 11:17:35,812 [hz._hzInstance_1_dev.IO.thread-Acceptor] com.hazelcast.nio.tcp.SocketAcceptorThread: [172.16.100.22]:5701 [dev] [3.7.5] Accepting socket connection from /172.16.100.24:47948
INFO  2017-02-02 11:17:35,813 [hz._hzInstance_1_dev.cached.thread-2] com.hazelcast.nio.tcp.TcpIpConnectionManager: [172.16.100.22]:5701 [dev] [3.7.5] Established socket connection between /172.16.100.22:5701 and /172.16.100.24:47948
INFO  2017-02-02 11:17:41,817 [hz._hzInstance_1_dev.priority-generic-operation.thread-0] com.hazelcast.internal.cluster.ClusterService: [172.16.100.22]:5701 [dev] [3.7.5]

Members [4] {
    Member [172.16.100.22]:5701 - 7cf49a37-03e3-4d58-8490-a4c1a39e6c51 this
    Member [172.16.100.23]:5701 - 23b5d6af-8741-4dd4-9704-01a3debfeb98
    Member [172.16.100.25]:5701 - 44f4d062-efa1-44b2-a262-8235400a3c80
    Member [172.16.100.24]:5701 - afe26ebe-0ab8-441c-8c06-3ec7aa88ca0d
}


Thanks - Juan

Vassilis Bekiaris

Feb 2, 2017, 3:25:08 PM
to haze...@googlegroups.com

Hi Juan,

There are two properties that are relevant to your use case:

hazelcast.merge.first.run.delay.seconds --> the initial delay after member startup before the split-brain handler runs for the first time
hazelcast.merge.next.run.delay.seconds --> the interval between subsequent runs of the split-brain handler

Lowering hazelcast.merge.next.run.delay.seconds lets your members detect a cluster to merge with more quickly, at the expense of some extra cycles spent running the handler more often.
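
As a rough sketch, assuming the same XML configuration style as in your message, the two merge properties could sit next to the heartbeat properties you already set. The values 10 and 5 are placeholders for a test setup, not recommendations. As far as I recall, the 3.7 defaults are 300 seconds for the first run and 120 seconds for subsequent runs, which would be consistent with the 1-2 minutes you observe, but please verify that in the properties documentation.

```xml
<properties>
    <!-- Existing heartbeat settings from the original configuration -->
    <property name="hazelcast.max.no.heartbeat.seconds">5</property>
    <property name="hazelcast.heartbeat.interval.seconds">1</property>
    <!-- Illustrative values only: delay before the split-brain handler first runs -->
    <property name="hazelcast.merge.first.run.delay.seconds">10</property>
    <!-- Illustrative values only: interval between later split-brain handler runs -->
    <property name="hazelcast.merge.next.run.delay.seconds">5</property>
</properties>
```

Set the same values on every member, since each side of the split runs its own handler.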

See more in the properties documentation section: http://docs.hazelcast.org/docs/3.7/manual/html-single/index.html#system-properties

Cheers!
Vassilis