How often does Hazelcast look for new/old members?

jsjunk...@gmail.com

Feb 2, 2017, 2:34:43 PM

to Hazelcast

Hello,

I am testing split-brain scenarios using partition groups and noticed the following: after the network segmentation is fixed, it takes around 1-2 minutes for all members to detect that they can see each other again. Is there a network configuration parameter I can change to tell Hazelcast how often to look for new/existing members? I am using TCP/IP discovery and partition groups:

    <properties>
        <property name="hazelcast.max.no.heartbeat.seconds">5</property>
        <property name="hazelcast.heartbeat.interval.seconds">1</property>
    </properties>
    <network>
        <port auto-increment="false" port-count="100">5701</port>
        <outbound-ports>
            <!--
            Allowed port range when connecting to other nodes.
            0 or * means use system provided port.
            -->
            <ports>0</ports>
        </outbound-ports>
        <join>
            <multicast enabled="false">
                <multicast-group>224.2.2.3</multicast-group>
                <multicast-port>54327</multicast-port>
            </multicast>
            <tcp-ip enabled="true">
                <member-list>
                    <member>172.16.100.22</member>
                    <member>172.16.100.23</member>
                    <member>172.16.100.24</member>
                    <member>172.16.100.25</member>
                </member-list>
            </tcp-ip>
            <aws enabled="false">
                <access-key>my-access-key</access-key>
                <secret-key>my-secret-key</secret-key>
                <!--optional, default is us-east-1 -->
                <region>us-west-1</region>
                <!--optional, default is ec2.amazonaws.com. If set, region shouldn't be set as it will override this property -->
                <host-header>ec2.amazonaws.com</host-header>
                <!-- optional, only instances belonging to this group will be discovered, default will try all running instances -->
                <security-group-name>hazelcast-sg</security-group-name>
                <tag-key>type</tag-key>
                <tag-value>hz-nodes</tag-value>
            </aws>
            <discovery-strategies>
            </discovery-strategies>
        </join>
        <interfaces enabled="false">
            <interface>10.10.1.*</interface>
        </interfaces>
        <ssl enabled="false"/>
        <socket-interceptor enabled="false"/>
        <symmetric-encryption enabled="false">
            <!--
               encryption algorithm such as
               DES/ECB/PKCS5Padding,
               PBEWithMD5AndDES,
               AES/CBC/PKCS5Padding,
               Blowfish,
               DESede
            -->
            <algorithm>PBEWithMD5AndDES</algorithm>
            <!-- salt value to use when generating the secret key -->
            <salt>thesalt</salt>
            <!-- pass phrase to use when generating the secret key -->
            <password>thepass</password>
            <!-- iteration count to use when generating the secret key -->
            <iteration-count>19</iteration-count>
        </symmetric-encryption>
    </network>
<!--
    <partition-group enabled="false"/>
-->
    <partition-group enabled="true" group-type="CUSTOM">
        <member-group>
            <interface>172.16.100.22</interface>
            <interface>172.16.100.23</interface>
        </member-group>
        <member-group>
            <interface>172.16.100.24</interface>
            <interface>172.16.100.25</interface>
        </member-group>
    </partition-group>

Connectivity was restored (the iptables rules simulating the network segmentation were removed) at 2017-02-02 11:16:30. Log from member 172.16.100.22:

INFO  2017-02-02 11:13:08,629 [hz._hzInstance_1_dev.cached.thread-3] com.hazelcast.internal.cluster.ClusterService: [172.16.100.22]:5701 [dev] [3.7.5] 

Members [2] {
    Member [172.16.100.22]:5701 - 7cf49a37-03e3-4d58-8490-a4c1a39e6c51 this
    Member [172.16.100.23]:5701 - 23b5d6af-8741-4dd4-9704-01a3debfeb98
}

INFO  2017-02-02 11:13:09,777 [hz._hzInstance_1_dev.migration] com.hazelcast.internal.partition.impl.MigrationManager: [172.16.100.22]:5701 [dev] [3.7.5] Re-partitioning cluster data... Migration queue size: 271
INFO  2017-02-02 11:13:11,419 [hz._hzInstance_1_dev.migration] com.hazelcast.internal.partition.impl.MigrationThread: [172.16.100.22]:5701 [dev] [3.7.5] All migration tasks have been completed, queues are empty.
INFO  2017-02-02 11:17:35,628 [hz._hzInstance_1_dev.cached.thread-4] com.hazelcast.nio.tcp.TcpIpConnectionManager: [172.16.100.22]:5701 [dev] [3.7.5] Established socket connection between /172.16.100.22:35682 and /172.16.100.25:5701
INFO  2017-02-02 11:17:35,639 [hz._hzInstance_1_dev.cached.thread-2] com.hazelcast.cluster.impl.TcpIpJoiner: [172.16.100.22]:5701 [dev] [3.7.5] [172.16.100.25]:5701 should merge to this node , because : node.getThisAddress().hashCode() < joinMessage.address.hashCode() , this node data member count: 2
INFO  2017-02-02 11:17:35,640 [hz._hzInstance_1_dev.cached.thread-4] com.hazelcast.nio.tcp.TcpIpConnectionManager: [172.16.100.22]:5701 [dev] [3.7.5] Established socket connection between /172.16.100.22:39616 and /172.16.100.24:5701
INFO  2017-02-02 11:17:35,793 [hz._hzInstance_1_dev.IO.thread-in-1] com.hazelcast.nio.tcp.TcpIpConnection: [172.16.100.22]:5701 [dev] [3.7.5] Connection[id=8, /172.16.100.22:39616->/172.16.100.24:5701, endpoint=[172.16.100.24]:5701, alive=false, type=MEMBER] closed. Reason: Connection closed by the other side
INFO  2017-02-02 11:17:35,796 [hz._hzInstance_1_dev.IO.thread-in-2] com.hazelcast.nio.tcp.TcpIpConnection: [172.16.100.22]:5701 [dev] [3.7.5] Connection[id=7, /172.16.100.22:35682->/172.16.100.25:5701, endpoint=[172.16.100.25]:5701, alive=false, type=MEMBER] closed. Reason: Connection closed by the other side
INFO  2017-02-02 11:17:35,806 [hz._hzInstance_1_dev.IO.thread-Acceptor] com.hazelcast.nio.tcp.SocketAcceptorThread: [172.16.100.22]:5701 [dev] [3.7.5] Accepting socket connection from /172.16.100.25:50497
INFO  2017-02-02 11:17:35,808 [hz._hzInstance_1_dev.cached.thread-2] com.hazelcast.nio.tcp.TcpIpConnectionManager: [172.16.100.22]:5701 [dev] [3.7.5] Established socket connection between /172.16.100.22:5701 and /172.16.100.25:50497
INFO  2017-02-02 11:17:35,812 [hz._hzInstance_1_dev.IO.thread-Acceptor] com.hazelcast.nio.tcp.SocketAcceptorThread: [172.16.100.22]:5701 [dev] [3.7.5] Accepting socket connection from /172.16.100.24:47948
INFO  2017-02-02 11:17:35,813 [hz._hzInstance_1_dev.cached.thread-2] com.hazelcast.nio.tcp.TcpIpConnectionManager: [172.16.100.22]:5701 [dev] [3.7.5] Established socket connection between /172.16.100.22:5701 and /172.16.100.24:47948
INFO  2017-02-02 11:17:41,817 [hz._hzInstance_1_dev.priority-generic-operation.thread-0] com.hazelcast.internal.cluster.ClusterService: [172.16.100.22]:5701 [dev] [3.7.5] 

Members [4] {
    Member [172.16.100.22]:5701 - 7cf49a37-03e3-4d58-8490-a4c1a39e6c51 this
    Member [172.16.100.23]:5701 - 23b5d6af-8741-4dd4-9704-01a3debfeb98
    Member [172.16.100.25]:5701 - 44f4d062-efa1-44b2-a262-8235400a3c80
    Member [172.16.100.24]:5701 - afe26ebe-0ab8-441c-8c06-3ec7aa88ca0d
}

Thanks - Juan

Vassilis Bekiaris

Feb 2, 2017, 3:25:08 PM

to haze...@googlegroups.com

Hi Juan,

There are two properties relevant to your use case:

hazelcast.merge.first.run.delay.seconds: the initial delay after member startup before the split-brain handler runs for the first time
hazelcast.merge.next.run.delay.seconds: the interval between subsequent runs of the split-brain handler

Lowering hazelcast.merge.next.run.delay.seconds lets a member detect a cluster to merge into more quickly, at the cost of some extra CPU cycles because the handler runs more often.
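
As a rough sketch, the two properties could be added to the existing <properties> block as shown here. The values 30 and 10 are illustrative assumptions only, not recommendations, and the <hazelcast> root element is included just to keep the fragment well-formed. As far as I know the 3.x defaults are 300 and 120 seconds respectively, which would be consistent with the 1-2 minute delay you observed, but please check the documentation for your version.

```xml
<hazelcast>
    <properties>
        <!-- Existing heartbeat settings from the original configuration -->
        <property name="hazelcast.max.no.heartbeat.seconds">5</property>
        <property name="hazelcast.heartbeat.interval.seconds">1</property>
        <!-- Delay after member startup before the split-brain handler first runs (example value) -->
        <property name="hazelcast.merge.first.run.delay.seconds">30</property>
        <!-- Interval between subsequent runs of the split-brain handler (example value) -->
        <property name="hazelcast.merge.next.run.delay.seconds">10</property>
    </properties>
</hazelcast>
```

With these example values, a member would check for a cluster to merge into roughly every 10 seconds instead of every couple of minutes.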

See more in the properties documentation section: http://docs.hazelcast.org/docs/3.7/manual/html-single/index.html#system-properties

Cheers!
Vassilis
