Hazelcast 2.0: Maximum limit of backup count

Md Kamaruzzaman

Mar 6, 2012, 9:52:09 AM

to Hazelcast

What is the maximum backup count? What is the effect of setting the
backup count high, e.g. a backup count of 100 for 101 nodes? What
happens if the backup count is higher than the total number of nodes?

Thanks,
Md Kamaruzzaman

Mehmet Dogan

Mar 6, 2012, 10:20:11 AM

to haze...@googlegroups.com

The maximum backup count is 6. Setting a value higher than 6 has no effect.

Tsai Li Ming

Mar 6, 2012, 8:39:10 PM

to haze...@googlegroups.com

Where is this limit set in the source code?

Also, what was the rationale for limiting it to 6 backup copies?

Regards,
Liming

Talip Ozturk

Mar 7, 2012, 3:03:27 AM

to haze...@googlegroups.com

> Where is this limit set in the source code?

PartitionInfo.MAX_REPLICA_COUNT defaults to 7: 1 for the owner and 6 for backups.
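
As a rough illustration of "setting bigger than 6 has no effect", the cap can be pictured as a simple clamp. This is only a sketch and not Hazelcast source code: the class and the method `effectiveBackupCount` are hypothetical names, and the handling of a backup count larger than the number of other nodes is an assumption (the thread does not answer that part of the question); it assumes a node never holds a backup of data it owns.

```java
// Hypothetical sketch, NOT Hazelcast source code.
final class BackupCountSketch {
    // Value quoted in the thread: 1 owner + 6 backups.
    static final int MAX_REPLICA_COUNT = 7;
    static final int MAX_BACKUP_COUNT = MAX_REPLICA_COUNT - 1;

    // Assumption: a configured value above the cap behaves like the cap,
    // and there cannot be more backups than there are other nodes.
    static int effectiveBackupCount(int configured, int clusterSize) {
        if (configured < 0 || clusterSize < 1) {
            throw new IllegalArgumentException("invalid backup count or cluster size");
        }
        int otherNodes = clusterSize - 1;
        return Math.min(configured, Math.min(MAX_BACKUP_COUNT, otherNodes));
    }

    public static void main(String[] args) {
        // The question from the first post: backup count 100 on 101 nodes.
        System.out.println(effectiveBackupCount(100, 101));
        // Backup count 2 on a 2-node cluster (assumed behaviour).
        System.out.println(effectiveBackupCount(2, 2));
    }
}
```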

> Also, what was the rationale for limiting it to 6 backup copies?

For me, even 6 backups is too many. You should consider using NearCache instead.

PartitionInfo.MAX_REPLICA_COUNT may become configurable in the future, but
the lower the number, the better. The migration process in Hazelcast
migrates each replica separately, so a higher replica count can lead to
much longer migration times. For smoother performance, keep that number
as low as possible.

http://twitter.com/oztalip

Md Kamaruzzaman

Mar 7, 2012, 7:53:28 AM

to Hazelcast

Actually, the purpose of a backup is different from that of NearCache.
Anyway, is there any master-slave relationship in backups? I have a cluster of
A (:8540), B (:8550), C (:8560) with backup count 2. A new node is then
added, connecting to A. When A is then shut down, the following message
appears on the new node:

Warnung: /127.0.0.1:8570 [dev] Received a ClusterRuntimeState, but its
sender doesn't seem master! => Sender: Address[127.0.0.1:8550],
Master: Address[127.0.0.1:8540]! (Ignore if master node has changed
recently.)

It would be nice if you could clarify the master node concept in the above
scenario.

Thanks,
Md Kamaruzzaman

Talip Ozturk

Mar 7, 2012, 8:23:42 AM

to haze...@googlegroups.com

> Anyway, is there any master-slave relationship in backups? I have a cluster of
> A (:8540), B (:8550), C (:8560) with backup count 2. A new node is then
> added, connecting to A. When A is then shut down, the following message
> appears on the new node:
>
> Warnung: /127.0.0.1:8570 [dev] Received a ClusterRuntimeState, but its
> sender doesn't seem master! => Sender: Address[127.0.0.1:8550],
> Master: Address[127.0.0.1:8540]! (Ignore if master node has changed
> recently.)
>
> It would be nice if you could clarify the master node concept in the above
> scenario.

There is no master-slave relationship in backups. The warning message has
nothing to do with backups. It is there for us to double-check that the
cluster state is correct. From your scenario, things look fine. 'Master' in
this message means 'the oldest member in the cluster'. A was the oldest
member and you shut it down, so, as the warning message says, you can
ignore it.

-talip

Md Kamaruzzaman

Mar 7, 2012, 8:38:22 AM

to Hazelcast

Thanks for the clarification. In Hazelcast 2.0, the backup strategy has
changed completely. It would be nice to have more explanation of the new
strategy. I have the following scenario:

backup-count = 2
Node A is up
Node B is up and A-B are connected.
Node C is up and A-B-C are connected.

Now, A has two backups, on B and on C.
B has two backups, on A and on C.
C has two backups, on A and on B.

Now if node D joins the cluster via node A, then in my opinion there
should be no backup data on node D, since nodes A, B and C have all
reached the backup count (2). But I can see backup data on node D as
follows:

Number of Entries: 100
Own Entry: 24, Backup Entry: 39

It would be nice if you could kindly explain this.

Thanks,
Md Kamaruzzaman

Talip Ozturk

Mar 7, 2012, 9:40:16 AM

to haze...@googlegroups.com

> It would be nice to have more explanation about the new strategy. I
> have the following scenario:
>
> backup-count = 2
> Node A is up
> Node B is up and A-B are connected.
> Node C is up and A-B-C are connected.
>
> Now, A has two backups, on B and on C.
> B has two backups, on A and on C.
> C has two backups, on A and on B.
>
> Now if node D joins the cluster via node A, then in my opinion there
> should be no backup data on node D, since nodes A, B and C have all
> reached the backup count (2).

Not correct! First of all, backup-count doesn't dictate the number of
nodes the backup data will be on.
Even if you had backup-count=1, A -can- be backed up on B, C and D!
backup-count=1 means there will be only one backup copy of each entry,
held on one of the other nodes.
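
To make the distinction concrete, here is a minimal sketch in which every partition has exactly one backup, yet the backups of node A's partitions end up on B, C and D. The placement rule (owner is `p % n`, backups rotate through the other nodes) is invented for this illustration and is not claimed to be Hazelcast's actual partition assignment; all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical placement rule for illustration only, NOT Hazelcast's algorithm.
final class BackupSpreadSketch {
    // Returns the nodes holding a partition: index 0 is the owner,
    // the remaining entries are the backups (one per backup-count).
    static List<Integer> replicas(int partition, int nodeCount, int backupCount) {
        if (nodeCount < 1 || backupCount < 0 || backupCount > nodeCount - 1) {
            throw new IllegalArgumentException("backupCount must be between 0 and nodeCount - 1");
        }
        int owner = partition % nodeCount;
        int rotation = partition / nodeCount;
        List<Integer> holders = new ArrayList<>();
        holders.add(owner);
        for (int r = 0; r < backupCount; r++) {
            // The offset is always in 1..nodeCount-1, so a backup never lands on the owner.
            int offset = 1 + (rotation + r) % (nodeCount - 1);
            holders.add((owner + offset) % nodeCount);
        }
        return holders;
    }

    // Collects every node that holds at least one backup of the given owner's partitions.
    static Set<Integer> backupHoldersFor(int owner, int nodeCount, int partitionCount, int backupCount) {
        Set<Integer> result = new TreeSet<>();
        for (int p = 0; p < partitionCount; p++) {
            List<Integer> holders = replicas(p, nodeCount, backupCount);
            if (holders.get(0) == owner) {
                result.addAll(holders.subList(1, holders.size()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] names = {"A", "B", "C", "D"};
        // 12 partitions, backup-count = 1: each partition has a single backup copy.
        for (int node : backupHoldersFor(0, names.length, 12, 1)) {
            System.out.println("A has backups on " + names[node]);
        }
    }
}
```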

> But I can see backup data
> in node D as follows:
>
> Number of Entries: 100
> Own Entry: 24, Backup Entry: 39

That is normal: those are 39 backup entries belonging to A, B, and C. Each
node can potentially hold backup entries of every other node in the cluster.

Say you have 2TB of data on a 50-node cluster, with each node storing 20GB
of primary and 20GB of backup data (assuming backup-count=1). Let's focus on
one of the nodes, say node3. The 20GB of primary data that node3 holds is
backed up by all 49 other nodes, each backing up 1/49th of the 20GB. If
node3 dies, each remaining member takes ownership of 1/49th of its data;
notice that no migration is needed and the cluster is still well balanced!
So the backup mechanism is designed so that there is no need to rebalance
after crashes. Say you add 5 more nodes. There is no immediate
action for the cluster to take; Hazelcast will slowly migrate some
of the data (primary and backup) to the new nodes.
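
The node3 example can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not Hazelcast code: it uses 50 nodes, a made-up partition count of 50 * 49 so the numbers divide evenly, backup-count = 1, and an invented placement rule that spreads each node's backups evenly over the other 49 nodes. When a node fails, the backup holder of each of its partitions simply becomes the owner.

```java
// Hypothetical simulation for illustration only, NOT Hazelcast's algorithm.
final class FailoverBalanceSketch {
    static final int NODES = 50;
    // Chosen so that every node owns 49 partitions, one backed up on each other node.
    static final int PARTITIONS = NODES * (NODES - 1);

    static int owner(int partition) {
        return partition % NODES;
    }

    // Single backup (backup-count = 1); the offset is in 1..49, so never the owner itself.
    static int backup(int partition) {
        int offset = 1 + partition / NODES;
        return (owner(partition) + offset) % NODES;
    }

    // Counts how many partitions each node owns once the backups of the
    // failed node's partitions have been promoted to owners.
    static int[] ownedAfterFailure(int failedNode) {
        int[] owned = new int[NODES];
        for (int p = 0; p < PARTITIONS; p++) {
            int holder = owner(p);
            if (holder == failedNode) {
                holder = backup(p); // promotion: no data has to move between nodes
            }
            owned[holder]++;
        }
        return owned;
    }

    public static void main(String[] args) {
        int[] owned = ownedAfterFailure(3);
        for (int node = 0; node < NODES; node++) {
            System.out.println("node" + node + " owns " + owned[node] + " partitions");
        }
    }
}
```

In this model every surviving node ends up owning the same number of partitions, which is the "still well balanced, no migration needed" property described above.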

http://twitter.com/oztalip

Md Kamaruzzaman

Mar 7, 2012, 11:34:30 AM

to Hazelcast

So, if nodes 1 and 2 are stopped simultaneously, the cluster
will lose 1/49th of node 1's data (backed up by node 2)
and 1/49th of node 2's data (backed up by node 1). So how can the backup
count help avoid data loss in this case?
It would be helpful if you could explain the 50-node scenario with a backup
count of 2.

Thanks,
Md Kamaruzzaman

Talip Ozturk

Mar 7, 2012, 11:51:24 AM

to haze...@googlegroups.com

If the backup-count is 2 then each node will back up 2/49th of the
records owned by node1 and 2/49th of the records owned by node2.

Also note that if you have backup-count = 2 then there will be 3
copies of each entry: one primary and two backups. That didn't change
at all. Only the backup distribution logic changed.
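
A partition is only lost if every node holding one of its copies stops at the same moment. The sketch below counts such partitions for the 50-node scenario. It is a toy model under assumptions, not Hazelcast's real placement: the partition count (50 * 49), the rotation rule for choosing backup nodes, and all names are hypothetical, and it ignores the re-creation of backups that happens after a failure.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model for illustration only, NOT Hazelcast's algorithm.
final class DataLossSketch {
    static final int NODES = 50;
    static final int PARTITIONS = NODES * (NODES - 1);

    // Convenience helper to build a set of node ids.
    static Set<Integer> nodes(Integer... ids) {
        return new HashSet<>(Arrays.asList(ids));
    }

    // Counts partitions whose owner AND all backups are in the stopped set.
    static int lostPartitions(int backupCount, Set<Integer> stopped) {
        if (backupCount < 0 || backupCount > NODES - 1) {
            throw new IllegalArgumentException("backupCount out of range");
        }
        int lost = 0;
        for (int p = 0; p < PARTITIONS; p++) {
            int owner = p % NODES;
            int rotation = p / NODES;
            boolean allGone = stopped.contains(owner);
            for (int r = 0; r < backupCount && allGone; r++) {
                // r-th backup: rotates through the other 49 nodes, never the owner.
                int holder = (owner + 1 + (rotation + r) % (NODES - 1)) % NODES;
                allGone = stopped.contains(holder);
            }
            if (allGone) {
                lost++;
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        System.out.println("backup-count=1, nodes 1 and 2 stop: "
                + lostPartitions(1, nodes(1, 2)) + " partitions lost");
        System.out.println("backup-count=2, nodes 1 and 2 stop: "
                + lostPartitions(2, nodes(1, 2)) + " partitions lost");
        System.out.println("backup-count=2, nodes 1, 2 and 3 stop: "
                + lostPartitions(2, nodes(1, 2, 3)) + " partitions lost");
    }
}
```

In this model, stopping two nodes at once loses data with backup-count = 1 but not with backup-count = 2, while stopping three nodes at once can still lose data with backup-count = 2.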

-talip

Tsai Li Ming

Mar 7, 2012, 11:59:45 AM

to haze...@googlegroups.com

If a node is stopped, will Hazelcast automatically start creating new backup copies, since the node that held the primary or backup copies is gone?

Talip Ozturk

Mar 7, 2012, 12:03:59 PM

to haze...@googlegroups.com

Yes. The remaining backup node will become the new owner of the entries.
The first backup will be taken immediately, and the second backup will be
scheduled.

-talip

Md Kamaruzzaman

Mar 7, 2012, 12:06:53 PM

to Hazelcast

So, there is always a chance of losing some data when several
nodes are stopped simultaneously. By my analysis, when the number of
simultaneously stopped nodes is greater than the backup-count, we will
lose some data. It would be nice if you could provide detailed
documentation of backups in a future release.

Many thanks.
-Md Kamaruzzaman
