adding new server/node/zookeeper to storm


tony

Jul 13, 2012, 6:12:55 AM
to storm...@googlegroups.com
Hi 

One more question I can't seem to find the answer to, although I have been searching quite a lot.
Say we have a Storm cluster: 1 nimbus server (with ZooKeeper) and 2 servers (supervisor + ZooKeeper on each server).

Now, if I want to extend my cluster dynamically, without shutting down the cluster, by adding a new server (ZooKeeper/ZeroMQ/supervisor), how do I add the new server and let the nimbus know about it?

On the initial setup (1 nimbus + 2 servers) the config file (storm.yaml) contains the IP address of each server, but when adding a new server I don't understand how to reconfigure the files and tell the nimbus to start monitoring it.
Or maybe I don't need to reconfigure the storm.yaml on each initial server? If I set up a new server and point its storm.yaml at the "initial" server IP addresses, will the new node make itself known automatically, and will the topologies get rebalanced when I run the rebalance command?

It would be great to have documentation on the wiki explaining this process, and whether or not the cluster needs to be shut down when adding a new server.

Thank you,

bm

Jul 13, 2012, 7:15:32 AM
to storm...@googlegroups.com
I may be wrong, but I thought that's what the rebalance command is for.

tony

Jul 13, 2012, 7:23:47 AM
to storm...@googlegroups.com
Well, what I'm looking for is: if I add a new server, do I need to edit the nimbus's storm.yaml and the "current" supervisors' yaml config files? And if I do a rebalance, will the nimbus and the "initial or current" supervisors pick up the new config file and the new server?

Ashley Brown

Jul 13, 2012, 7:28:27 AM
to storm...@googlegroups.com
Hi Tony,

You're confusing zookeepers and supervisors.

ZooKeeper doesn't currently support dynamically changing cluster sizes, so you should have a static set of ZooKeeper servers. These are listed in storm.yaml on the nimbus and supervisor nodes.

If you want to add a new supervisor, just set the storm.yaml as on your existing supervisors and nimbus will pick it up (the supervisors register themselves with ZooKeeper, which is where Nimbus finds them). The rebalance command will then spread a topology out onto the new supervisor.
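To make those steps concrete, here's a sketch of what adding a supervisor might look like (the topology name "mytopo", the hostname "swarm00", the paths and the worker count are all made-up examples; adjust for your install):

====
# on the new server: install storm + zeromq, then copy storm.yaml
# from an existing supervisor -- it only needs the nimbus host and
# the zookeeper hostnames
scp swarm00:/opt/storm/conf/storm.yaml /opt/storm/conf/storm.yaml
storm supervisor &

# then, from any machine with the storm client configured, spread a
# running topology onto the enlarged cluster
storm rebalance mytopo -w 30 -n 6
====

(-w is the wait time in seconds before rebalancing, -n the new total number of workers.)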

Ashley

--
Dr Ashley Brown
Chief Architect

e: ash...@spider.io
a: spider.io, 10 Warwick St, London, W1B 5LZ
w: http://spider.io/

tony

Jul 13, 2012, 7:43:45 AM
to storm...@googlegroups.com
Hi Ashley,

Thanks for the reply. After re-reading your post a few times and thinking about it, I think I now understand :) In the setup I have at the moment, each supervisor is set up and registers with the ZooKeeper on the same machine. So if I want to scale up with a new server, I can't just clone a supervisor machine instance and start it. I need to have this "new" ZooKeeper already registered with the nimbus first, otherwise there won't be any discovery.

Ashley Brown

Jul 13, 2012, 7:59:03 AM
to storm...@googlegroups.com
Hi Tony,

You should think of your ZooKeeper ensemble as a separate entity. You can't just add new ZooKeepers (see https://issues.apache.org/jira/browse/ZOOKEEPER-107). It sounds as though you're currently using ZooKeepers which aren't communicating with each other, which makes me wonder how you have Storm working at all! 

You should have a fixed number of ZooKeepers, configured to communicate with each other as an ensemble. Supply the hostnames of these ZooKeepers in storm.yaml. I'll paste a copy of our configuration below. The same configuration is used on Nimbus and Supervisor instances. We have a nimbus (nimbus00), three supervisors (swarm00-swarm02) and three ZooKeepers (silvermane00-02).

Here's the relevant bit of config:
====

nimbus.host: nimbus00

storm.zookeeper.servers:
    - silvermane00
    - silvermane01
    - silvermane02

<snip>
====

Note that the names of the supervisors are never mentioned. They only need to know where the nimbus is, and where the ZooKeepers are.

For a cluster of your size, where you aren't running ZooKeeper for anything else, you can probably get away with running a single ZK on the nimbus instance (if you don't mind the cluster going down when ZK goes down). If you want redundancy you need at least 3 ZK instances, since a working majority of (n/2) + 1 nodes must stay up -- with fewer than 3 servers you can't survive the loss of any node.
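For reference, a minimal zoo.cfg for a 3-node ensemble might look like the following (using the silvermane hostnames above; the ports shown are the ZooKeeper defaults, and this is a sketch rather than our exact config):

====
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=silvermane00:2888:3888
server.2=silvermane01:2888:3888
server.3=silvermane02:2888:3888
====

The same file goes on all three hosts; each host additionally needs a myid file in dataDir containing just its own server number (1, 2 or 3).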

A

Ashley Brown

Jul 13, 2012, 8:00:22 AM
to storm...@googlegroups.com
And you should probably read the ZooKeeper cluster setup guide.

