Hazelcast on EC2


tim_darby

Dec 11, 2009, 7:03:02 AM
to Hazelcast
Hi,

Just posting up to get thoughts and/or potential pitfalls here from
the crowd.

Basically, I'm looking at getting Hazelcast set up on an EC2-based
cluster (leaving aside any problems of being partially off EC2 for now,
for simplicity), but without having to hardcode a specific 'master'
IP. This is because I want to be as close to 'truly' peer-to-peer as
possible.

In order to achieve this, what I'm doing on app startup is first
polling the EC2 APIs for machine instance IDs matching a particular
spec (ami-id, group), then programmatically amending a Hazelcast
Config object to ensure multicast is off and to put all the known
matching instances' IP addresses (these should be private IPs, but for
reasons unknown to me they're not, so right now they're the public
IPs) into the pre-configured TCP hosts list.
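
A minimal sketch of the selection step described above, in plain Java with no AWS or Hazelcast dependency. The class `Ec2MemberList`, the `InstanceInfo` holder, and the method `memberAddresses` are hypothetical names made up for illustration; in practice the list would be filled from whatever the EC2 API returns. It filters by AMI ID and group, and prefers the private IP when one is present, falling back to the public IP otherwise.

```java
import java.util.ArrayList;
import java.util.List;

public class Ec2MemberList {

    /** Hypothetical stand-in for one entry returned by an EC2 instance listing. */
    public static final class InstanceInfo {
        final String amiId;
        final String group;
        final String privateIp; // may be null or empty if the API did not return one
        final String publicIp;

        public InstanceInfo(String amiId, String group, String privateIp, String publicIp) {
            this.amiId = amiId;
            this.group = group;
            this.privateIp = privateIp;
            this.publicIp = publicIp;
        }
    }

    /**
     * Returns the addresses of all instances matching the given AMI ID and group,
     * preferring the private IP and falling back to the public one.
     */
    public static List<String> memberAddresses(List<InstanceInfo> instances,
                                               String amiId, String group) {
        List<String> members = new ArrayList<>();
        for (InstanceInfo i : instances) {
            if (!amiId.equals(i.amiId) || !group.equals(i.group)) {
                continue; // not part of this cluster's spec
            }
            boolean hasPrivate = i.privateIp != null && !i.privateIp.isEmpty();
            String ip = hasPrivate ? i.privateIp : i.publicIp;
            if (ip != null && !ip.isEmpty() && !members.contains(ip)) {
                members.add(ip);
            }
        }
        // Each address would then go into the Hazelcast TCP hosts list with
        // multicast disabled. In later Hazelcast releases that is roughly
        // config.getNetworkConfig().getJoin().getTcpIpConfig().addMember(ip);
        // the exact calls depend on the Hazelcast version in use.
        return members;
    }
}
```

The resulting list is what would be written into the Config object before calling 'Hazelcast.newInstance(Config)'.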

(It's put together kludgily right now, but once I've refactored it
slightly into something which essentially looks like
'Hazelcast.newInstance(Config)', but instead is
'EC2HazelcastFactor.newInstance(Config, <ami id>, <group id>)', I'll
post it up for wider benefit.)

Now this seems to work, but my question is: can anyone think of any
big phat failures in this cunning plan?

As far as I can tell, this should work and not get Hazelcast into a
'split brain' scenario unless I somehow manage to make 2 independent
requests to create EC2 instances (one request for multiple nodes seems
to be immediately visible in the EC2 API getInstances call), such that
when each node comes up, each one believes it is the only node
which should be in the cluster (no multicast so no 'discovery'
possible, darn you Amazon).

What may happen over time is that different nodes in the cluster would
have different preconfigured lists of hosts in their config. But when
a new EC2 instance comes up, the list of instances it gets from Amazon
will be the *current* instances, so as long as one of those nodes is
still good to go, it will connect initially to that.
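
To make the split-brain reasoning above concrete, here is a rough sketch that checks whether a set of startup host lists could leave nodes in separate groups. It assumes a simplified model in which two nodes end up in the same cluster whenever one lists the other's address; real Hazelcast join behaviour is more involved. The class `SeedListCheck` and method `countGroups` are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SeedListCheck {

    /**
     * seedLists maps each running node's address to the host list it was
     * configured with at startup. Returns the number of disjoint groups;
     * a result above 1 means some nodes have no path to the others.
     */
    public static int countGroups(Map<String, List<String>> seedLists) {
        Map<String, String> parent = new HashMap<>();
        for (String node : seedLists.keySet()) {
            parent.put(node, node); // every node starts in its own group
        }
        for (Map.Entry<String, List<String>> e : seedLists.entrySet()) {
            for (String seed : e.getValue()) {
                if (parent.containsKey(seed)) { // skip hosts that are no longer running
                    union(parent, e.getKey(), seed);
                }
            }
        }
        Set<String> roots = new HashSet<>();
        for (String node : seedLists.keySet()) {
            roots.add(find(parent, node));
        }
        return roots.size();
    }

    // Follows parent links to the group's representative (with path halving).
    private static String find(Map<String, String> parent, String x) {
        while (!parent.get(x).equals(x)) {
            parent.put(x, parent.get(parent.get(x)));
            x = parent.get(x);
        }
        return x;
    }

    // Merges the groups containing a and b.
    private static void union(Map<String, String> parent, String a, String b) {
        String ra = find(parent, a);
        String rb = find(parent, b);
        if (!ra.equals(rb)) {
            parent.put(ra, rb);
        }
    }
}
```

Under this model, two nodes launched by independent requests that each see only themselves give two groups, while a late joiner whose list contains at least one live member collapses back to a single group.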

Go on - it can't be that simple, someone tell me the major fail I'm
creating for myself :D

Tim

Nic Pottier

Dec 11, 2009, 3:40:38 PM
to Hazelcast

Hiya Tim,

I'm going about the same thing..

One question is whether addressing your public IP from inside EC2
will trigger you paying for that transfer.

If the answer to the above is no, then another simpler strategy might
be to just reserve yourself some static IPs and always associate your
servers with them when you bring them up, and stick those static IPs
in the Hazelcast configs.

But I do like your approach as it sounds nice and simple... It does
seem to me like it would work fine except for the race scenario.

Another possibility I just thought of...

In my particular setup, I have a static public IP that is facing the
world and which will 'always' be up, either associated to a single
instance which is the only one, or associated to a load balancer
(possibly AMZN's). Could you just set that IP as your master?
Wouldn't Hazelcast then always pass along the other nodes that have
connected? Does it matter whether that IP is the 'real' master (i.e.,
if that IP is actually an LB)?

Anyways, sign me up for EC2Hazelcast.newInstance() :)

-Nic
