Partial Amazon EC2 cluster config

Alex Parvulescu

Oct 26, 2009, 12:26:37 PM
to haze...@googlegroups.com
hello,

I have a question about hazelcast and Amazon EC2 instances.

What I'd like to build is an on-demand EC2 slave cluster. I'd like to
have a fixed cluster of servers and, when needed, add more slaves from
Amazon, then shut them down afterwards.

The problem I had is that my local server (let's call it the master;
it has a public IP address) doesn't seem to be able to connect to the
EC2 slave. The EC2 machine has an IP that is not reachable from the
outside world, and it does not seem to be aware of its public IP (either
the Amazon machine name you use for ssh, or the elastic IP).
If I type ifconfig on the EC2 machine I see only one interface, the internal-IP one.

I've seen some emails about configuring EC2 instances, but they
assume the cluster is entirely on Amazon. In my use case I'd rather
use Amazon only for some processing, not for the entire infrastructure.

I think this is very similar to trying to use Hazelcast on a computer
that is behind a router: the connection seems to work, the 2 computers
'link' with each other, but the link soon fails as it's impossible to
sync properly.

Do you think it is possible to set up something like this?
I'm not sure if this is even possible, so any thoughts are welcome!

thanks,
alex

Talip Ozturk

Oct 26, 2009, 5:34:16 PM
to haze...@googlegroups.com
Alex,

I just want to make sure I understand. So please correct me if the
following statements are wrong:

1. You want to set up a dynamically scaling (up/down) Hazelcast cluster on EC2.
2. You still want to have your own application in your own datacenter.
3. Your application should be able to talk to the cluster on EC2 when
it needs processing power.

Did I get it right?
-talip

Alex Parvulescu

Oct 27, 2009, 4:16:18 AM
to haze...@googlegroups.com
yes Talip,

that is right

Does it make sense to use something like Hazelcast? It seems a bit
complicated to me, but I am not an expert.

thanks,
alex

Kevin C. Dorff

Oct 27, 2009, 10:34:55 AM
to haze...@googlegroups.com
Alex,

I haven't specifically done what you are asking but I think there are
two issues...

To start with, I assume you will run the "primary" (always locatable)
Hazelcast server(s) within your data center.

The first will be latency - it will take longer to communicate between
machines in your data center and EC2 but assuming Hazelcast can handle
this latency it probably isn't a big deal. You may need to research
timeout details with Hazelcast.

The second, larger, but maybe easier-to-solve issue is firewalling and
security of communication. You will need to allow incoming connections
(on the Hazelcast port(s)) from the Amazon EC2 servers, and those servers
will change with every new cluster you make, I suspect. Secondly, you
will need to allow data from your data center on the Hazelcast port(s)
into the EC2 machines (EC2 firewall rules). Finally, for security, I
would think you will want the strongest encryption, non-trivial
usernames, and passwords on your Hazelcast communication, at least
beyond the initial testing phase.

Hope some of this helped. I imagine Talip has greater insights into
this than I do.

Kevin

Alex Parvulescu

Oct 27, 2009, 12:17:23 PM
to haze...@googlegroups.com
Hello Kevin and thanks for the reply,

you do bring up some interesting issues, but I am only at step 1 of
the test - let's call it a 'hello world!' application.

Even this doesn't work, as Hazelcast is having trouble connecting the 2
computers (the one in my datacenter and the one on Amazon).

Let me compare the two, maybe it will shed more light on the situation.

The 'local' server has its own public IP (when I run 'ifconfig' I can
see the internal IP and the external one; I see all the interfaces the
server has, so I can tell Hazelcast to pick the right interface
when it starts).

On the other side of the fence, as far as I can see, the Amazon server
has only one interface, the internal-IP one, and Hazelcast will start
using the internal IP. What happens next is that the 2 servers discover
each other but fail to communicate, because of the Amazon internal
IP.
You can assign an elastic IP to that machine and use that as a
reference in the hazelcast.xml config file on the 'local' server. But
you cannot start Hazelcast on the public interface, because that
interface is not available on the machine. I guess it works like a
router inside Amazon: all requests coming in on that IP are redirected
to the machine, but the machine itself is unaware of that IP locally.
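
To see the same picture from inside the JVM rather than from ifconfig, the addresses a machine actually owns can be listed with the standard java.net.NetworkInterface API. This is only a rough sketch: the class name ListAddresses and the matchesPattern helper are made up for illustration, and the octet-wise wildcard match is a simplification of my own, not Hazelcast's interface-matching code.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

class ListAddresses {
    /** Returns the textual IP addresses bound to local interfaces that are up. */
    static List<String> localAddresses() throws SocketException {
        List<String> result = new ArrayList<>();
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            return result; // no interfaces reported on this machine
        }
        for (NetworkInterface nic : Collections.list(nics)) {
            if (!nic.isUp()) {
                continue;
            }
            for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                result.add(addr.getHostAddress());
            }
        }
        return result;
    }

    /**
     * Simplified IPv4 wildcard match (assumption: "*" stands for one whole octet).
     * Anything that is not four dot-separated parts, such as IPv6, never matches.
     */
    static boolean matchesPattern(String pattern, String address) {
        String[] p = pattern.split("\\.");
        String[] a = address.split("\\.");
        if (p.length != 4 || a.length != 4) {
            return false;
        }
        for (int i = 0; i < 4; i++) {
            if (!p[i].equals("*") && !p[i].equals(a[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws SocketException {
        // Hypothetical default pattern for a private 10.x.x.x network.
        String pattern = args.length > 0 ? args[0] : "10.*.*.*";
        for (String address : localAddresses()) {
            System.out.println(address + (matchesPattern(pattern, address) ? "  <- matches " + pattern : ""));
        }
    }
}
```

On a machine that sits behind an address translator, the public or elastic IP should simply not appear in this list, which would be consistent with what ifconfig shows.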

I may be off here, so please take this with a grain of salt.

thanks,
alex

Kevin C. Dorff

Oct 27, 2009, 1:06:51 PM
to haze...@googlegroups.com
First, I think we need to determine what your desired architecture is. Is it

* Always running machine in your data center, users only talk to your
machine(s) in your datacenter. You might have multiple machines within
your data center running the app, but pick one machine as the
"primary" node that runs Hazelcast (let's call this machine MASTER).
* Sometimes running machines in EC2 to do some processing when work
increases. Probably this is a work queue type system?

Assuming that is the case, I would take the following steps:

1. Write a tiny app (Groovy would be my choice) that starts Hazelcast,
say, creates a Hazelcast queue, then loops forever. The goal here is
just to get Hazelcast running on a machine in your datacenter.

2. Fire up an EC2 instance. From the EC2 instance you should be able
to telnet to the machine you defined above as master. I would telnet
by IP address to avoid name resolution issues.

telnet MASTER 5701

If you can connect to port 5701 on your data center machine via
telnet, that means the firewalls on your datacenter and on EC2
are set up correctly. If you cannot connect, you have made a
mistake in your firewall configurations. No sense in proceeding until
the firewall issues are ironed out.

3. Do not proceed until #2 is working correctly.

4. On all of your hazelcast.xml files, you just need to point to the
single "master" node that I described at the top of this message.

5. Run that same tiny app I mentioned in #1 and you should see the two
programs connect up.
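
The telnet check from step 2 can also be scripted, which may be handy when EC2 instances come and go. Here is a minimal Java sketch using only java.net.Socket. It assumes the master listens on port 5701 as in the telnet example; the class name PortCheck, the 127.0.0.1 default and the 3-second timeout are illustrative choices, not anything Hazelcast provides.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortCheck {
    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host: treat all as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Usage: java PortCheck.java <host> <port>
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 5701;
        boolean ok = canConnect(host, port, 3000);
        System.out.println(host + ":" + port + (ok ? " is reachable" : " is NOT reachable"));
    }
}
```

Like telnet, this only proves that a TCP connection can be opened in one direction, so it would need to be run from each side that has to initiate connections.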

I am not saying my hazelcast.xml that I use for EC2 will work for you,
as I ONLY use it within EC2, but what I use is (noting that
${MASTER_HOST} is defined when the file is written):

-------- start of file -------

<hazelcast>
    <group>
        <name>USERNAME</name>
        <password>PASSWORD</password>
    </group>
    <network>
        <port auto-increment="true">5701</port>
        <interfaces enabled="true">
            <interface>10.*.*.*</interface>
        </interfaces>
        <join>
            <multicast enabled="false">
                <multicast-group>224.2.2.3</multicast-group>
                <multicast-port>54327</multicast-port>
            </multicast>
            <tcp-ip enabled="true">
                <hostname>${MASTER_HOST}</hostname>
            </tcp-ip>
        </join>
    </network>
    <map name="default">
        <backup-count>1</backup-count>
        <eviction-policy>NONE</eviction-policy>
        <max-size>0</max-size>
        <eviction-percentage>25</eviction-percentage>
    </map>
</hazelcast>

-------- end of file -------
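
One possible way to fill in ${MASTER_HOST} when the file is written is plain string substitution over a template. The Java sketch below shows only that idea. The class name WriteConfig, the inline template fragment and the 203.0.113.10 placeholder address are assumptions for illustration; a real setup would read the full template from disk.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class WriteConfig {
    /** Replaces every literal ${MASTER_HOST} in the template with the given host. */
    static String render(String template, String masterHost) {
        return template.replace("${MASTER_HOST}", masterHost);
    }

    public static void main(String[] args) throws IOException {
        // Usage: java WriteConfig.java <master-host> [output-file]
        // 203.0.113.10 is a documentation-range address used as a placeholder.
        String masterHost = args.length > 0 ? args[0] : "203.0.113.10";

        // Only the join fragment is inlined here to keep the sketch short.
        String template =
                "<tcp-ip enabled=\"true\">\n"
              + "    <hostname>${MASTER_HOST}</hostname>\n"
              + "</tcp-ip>\n";

        String rendered = render(template, masterHost);
        if (args.length > 1) {
            Files.write(Paths.get(args[1]), rendered.getBytes(StandardCharsets.UTF_8));
        } else {
            System.out.print(rendered);
        }
    }
}
```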

Talip Ozturk

Oct 27, 2009, 2:27:30 PM
to haze...@googlegroups.com
Alex,

If all you want to do is:
1. You want to set up a dynamically scaling (up/down) Hazelcast cluster on EC2.
2. You still want to have your own application in your own datacenter.
3. Your application should be able to talk to the cluster on EC2 when
it needs processing power.

then

creating a cluster between your datacenter and EC2 may not even be
necessary. Here is how I would do it:
1. Create your Hazelcast cluster only on EC2.
2. Get a load-balancer public IP for your cluster (EC2 provides this service)
3. Your application should access the cluster via Hazelcast Java Client.

So you will have a clean client-server architecture where the server
part is actually a scalable cluster. Your application in your data
center will connect to the Hazelcast cluster on EC2 via the EC2 public IP.
As long as there is at least one node running in your EC2 cluster, the
client will keep working just fine.

Regards,
-talip

Alex Parvulescu

Oct 28, 2009, 6:32:03 AM
to haze...@googlegroups.com
hello,

thanks a lot guys for your help!!

good catch Kevin on that telnet thing. I'd like to add more to that:
telnet from Amazon to the master server AND telnet from the master
server to the Amazon server. It seems you have to open port 5701
in the security groups on Amazon.

In the hazelcast config file I have only the master server's IP, but
this means that the master server must be online first. I guess this
is obvious, but worth mentioning anyway.
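
A small guard on the slave side can make that ordering less fragile: wait until the master's port answers before starting the node. This is just a sketch with the plain JDK; the class name WaitForMaster, the retry counts and the 2-second connect timeout are all hypothetical, and the actual Hazelcast startup is left out.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class WaitForMaster {
    /**
     * Tries to open a TCP connection up to 'attempts' times, sleeping
     * 'delayMillis' between failed tries. Returns true on the first success.
     */
    static boolean waitForPort(String host, int port, int attempts, long delayMillis)
            throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), 2000);
                return true;
            } catch (IOException e) {
                if (i < attempts - 1) {
                    Thread.sleep(delayMillis); // master not up yet, try again later
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // Usage: java WaitForMaster.java <master-host> <port>
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 5701;
        if (waitForPort(host, port, 3, 1000)) {
            System.out.println("Master is up, safe to start the slave node.");
        } else {
            System.out.println("Master did not answer on " + host + ":" + port);
        }
    }
}
```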

thanks again Talip and Kevin!

congrats on releasing 1.7.1, I'll be taking that for a spin very soon :)

alex

Alex Parvulescu

Oct 28, 2009, 8:14:57 AM
to haze...@googlegroups.com
hello guys,

I did a small blog post about this test:
http://pfa-labs.blogspot.com/2009/10/hazelcast-working-with-partial-amazon.html

Kevin, if it's OK with you, can you send me a link to your blog? I'd
like to link to it from the blog post.

thanks,
alex

Talip Ozturk

Mar 29, 2010, 4:39:01 AM
to haze...@googlegroups.com
Hi Alex,

We are working on Hazelcast Testimonials. We would like to include a
quote from you if possible.

Can you please say a couple of words on Hazelcast?

Thanks,
-talip

