Node discovery via S3 "blackboard"


Tim Peierls

Feb 16, 2011, 2:59:24 PM
to Hazelcast
For those who are interested in running a Hazelcast cluster as part of
an Amazon Web Services Elastic Beanstalk application, I am happy to
report that I've had good preliminary results using a "blackboard"
approach with S3. As others have noted, AWS EC2 does not support
multicast, so you have to use TCP/IP. In Elastic Beanstalk, however,
you don't know the IP addresses of the participants in advance, and
there is no well-known "master" address that you can rely on, so there
needs to be a discovery phase. At application startup, before creating
the HazelcastInstance, I take the following steps:

1. Sign in, i.e., write this host's IP address as an object to a known
directory in S3 (specific to the application's environment).
2. Wait a little bit for any other potential cluster members that are
starting at roughly the same time to write their IP addresses.
3. Read all the IP addresses out of the S3 directory and use these
values in the Config object.
4. Use the Config object to create the HazelcastInstance.
5. Register a LifecycleListener with the LifecycleService associated
with the newly created HazelcastInstance.
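
A rough sketch of those steps in Java, for the curious. Everything named here is an assumption, not part of the original description: the bucket name, the "members/" prefix, and the 30-second settle time are invented, and the code assumes the AWS SDK for Java (v1) and a Hazelcast 3.x-style programmatic Config API.

```java
import java.net.InetAddress;

import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.S3ObjectSummary;
import com.hazelcast.config.Config;
import com.hazelcast.config.JoinConfig;
import com.hazelcast.config.TcpIpConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.LifecycleEvent;

public class BlackboardDiscovery {

    public static HazelcastInstance start(String accessKey, String secretKey)
            throws Exception {
        final AmazonS3 s3 = new AmazonS3Client(
                new BasicAWSCredentials(accessKey, secretKey));
        final String bucket = "my-app-env-blackboard";   // hypothetical name
        final String myIp = InetAddress.getLocalHost().getHostAddress();

        // 1. Sign in: write this host's IP address under a known prefix.
        s3.putObject(bucket, "members/" + myIp, myIp);

        // 2. Wait briefly for peers starting at roughly the same time.
        Thread.sleep(30_000);

        // 3. Read all signed-in addresses into the TCP/IP join config.
        Config config = new Config();
        JoinConfig join = config.getNetworkConfig().getJoin();
        join.getMulticastConfig().setEnabled(false);   // no multicast on EC2
        TcpIpConfig tcpIp = join.getTcpIpConfig().setEnabled(true);
        for (S3ObjectSummary s :
                s3.listObjects(bucket, "members/").getObjectSummaries()) {
            tcpIp.addMember(s.getKey().substring("members/".length()));
        }

        // 4. Create the instance from the assembled config.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);

        // 5. Register a listener that signs out on SHUTDOWN.
        hz.getLifecycleService().addLifecycleListener(event -> {
            if (event.getState() == LifecycleEvent.LifecycleState.SHUTDOWN) {
                s3.deleteObject(bucket, "members/" + myIp);
            }
        });
        return hz;
    }
}
```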

The listener waits for a SHUTDOWN state and then signs out, i.e.,
removes its IP address from the S3 directory. I'm planning a
refinement that would add a timestamp to each address and periodic
refreshing of the sign in so that addresses that were not signed out
properly could be culled.
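
That refinement might look something like this (a minimal sketch: the class, the method names, and the five-minute cutoff are all invented, and an in-memory map stands in for the S3 directory):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Each sign-in (or periodic refresh) records a timestamp next to the
// address; entries whose last refresh is older than the cutoff are
// treated as improperly signed out and culled.
class Blackboard {
    private final Map<String, Instant> entries = new HashMap<>();
    private final Duration maxAge;

    Blackboard(Duration maxAge) {
        this.maxAge = maxAge;
    }

    // Sign in or refresh: (re)record the address with a fresh timestamp.
    void refresh(String address, Instant now) {
        entries.put(address, now);
    }

    // Remove any address not refreshed within maxAge.
    void cull(Instant now) {
        entries.entrySet().removeIf(
                e -> Duration.between(e.getValue(), now).compareTo(maxAge) > 0);
    }

    Set<String> live() {
        return entries.keySet();
    }
}
```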

The IP addresses that are written are the EC2 "private" IP addresses,
which are afaict not routable outside of the EC2 zone, but are
routable within it.
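
For reference, one common way to obtain that private address from inside an instance (a standard AWS facility, not something described above) is the EC2 instance metadata service; a sketch:

```java
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;

public class PrivateIp {
    // Well-known EC2 metadata endpoint for the instance's private address.
    static final String METADATA_URL =
            "http://169.254.169.254/latest/meta-data/local-ipv4";

    // Reads the single-line response body; factored out so it can be
    // exercised without actually being on EC2.
    static String readAddress(InputStream body) throws Exception {
        try (BufferedReader in =
                new BufferedReader(new InputStreamReader(body))) {
            return in.readLine().trim();
        }
    }

    public static String fetch() throws Exception {
        return readAddress(new URL(METADATA_URL).openStream());
    }
}
```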

This appears to work well, but I confess I've only tried it with up to
three members. My question is whether anyone sees any potential
pitfalls with this approach, and if so, how to get around them.

--tim

Fuad Malikov

Feb 17, 2011, 3:27:46 AM
to haze...@googlegroups.com, Tim Peierls
Hi Tim,

This sounds ok. Thanks for sharing. We did a similar thing and could run up to 100 nodes. The only thing I don't like about this and our own approach is that the nodes need to contain the AWS credentials to access S3.

Here is the article describing how we did it. 


Cheers...

Fuad






--

@fuadm

Tim Peierls

Feb 17, 2011, 8:09:42 AM
to Fuad Malikov, haze...@googlegroups.com
On Thu, Feb 17, 2011 at 3:27 AM, Fuad Malikov <fu...@hazelcast.com> wrote:
This sounds ok. Thanks for sharing. We did a similar thing and could run up to 100 nodes. The only thing I don't like about this and our own approach is that the nodes need to contain the AWS credentials to access S3.

Elastic Beanstalk has a convenient facility for letting you provide the AWS credentials as system properties, defined in the application environment via your AWS console, not in the nodes themselves. This way the WAR file that you upload doesn't need to contain any sensitive information.
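
For example (the property names here are hypothetical; Beanstalk simply passes along whatever JVM system properties you define in the environment configuration):

```java
// Read AWS credentials from JVM system properties supplied by the
// Elastic Beanstalk environment, rather than bundling them in the WAR.
public class S3Credentials {
    public static String accessKey() {
        return System.getProperty("AWS_ACCESS_KEY_ID");
    }

    public static String secretKey() {
        return System.getProperty("AWS_SECRET_KEY");
    }
}
```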


Here is the article describing how we did it. 


This was the article that encouraged me to explore this path in the first place -- thanks! (I probably should have mentioned that in my first post, sorry. Hope this makes up for it.) Alas, Elastic Beanstalk limits you to 4 nodes in the free beta. :-(

I couldn't use exactly the same approach for node discovery, though, because I'm working through the Elastic Beanstalk interface and don't have programmatic access to the private IP addresses of the nodes.

--tim

James Cook

May 2, 2011, 10:50:40 AM
to haze...@googlegroups.com, Fuad Malikov
Elastic Search uses the EC2 API to autodiscover nodes in EC2 where multicast is not supported. It would be nice to see Hazelcast adopt this approach.

To this end, I have an Elastic Search instance _and_ Hazelcast instance embedded in each of my web server instances. Some time after Hazelcast is instantiated, I would like to query my Elastic Search network for available nodes. Then I plan to programmatically add the IP address of the other nodes to Hazelcast.

My question is whether an "unclustered" single Hazelcast node can be programmatically told to connect to a new node on the network after it already has been instantiated?

-- jim


Talip Ozturk

May 2, 2011, 11:12:46 AM
to haze...@googlegroups.com
> My question is whether an "unclustered" single Hazelcast node can be
> programmatically told to connect to a new node on the network after it
> already has been instantiated?
> -- jim

You can try this:

// start Hazelcast as a single instance
Hazelcast.init(config);

// update the config and add the known members
config.getNetwork().get...

// now restart the Hazelcast instance so that it can join the existing cluster
Hazelcast.getLifecycleService().restart();

http://twitter.com/oztalip
