Hi guys,
I have a simple topology that puts values on a Redis server.
When I deploy it to the cluster, no client manages to connect to Redis, although it works perfectly fine when I run it via LocalCluster.
I've attached the logs and conf for the nimbus and the supervisors.
The cluster is configured as follows:
- 1GB RAM for the nimbus (192.168.1.22)
- 1GB RAM for the zookeeper1 (192.168.1.31)
- 2GB RAM for the supervisor1 (192.168.1.16; 4 workers)
- 2GB RAM for the supervisor2 (192.168.1.19; 2 workers)
All machines are virtual and have JDK 6u33 x64 installed.
nimbus, supervisor1 & supervisor2 have Storm 0.8.0, ZeroMQ 2.1.7 and the latest JZMQ installed.
zookeeper1 has Python 2.6.6 (with default configuration) and Zookeeper 3.3.6 installed.
I'm not sure this is the whole problem, but I'm getting the following exception on one of my supervisors (supervisor2):
2012-08-10 08:21:27 worker [ERROR] Error on initialization of server mk-worker
java.io.FileNotFoundException: File '/opt/storm/local/supervisor/stormdist/DistributedSystem-1-1344586762/stormconf.ser' does not exist
at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:137)
at org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1135)
at backtype.storm.config$read_supervisor_storm_conf.invoke(config.clj:138)
at backtype.storm.daemon.worker$worker_data.invoke(worker.clj:146)
at backtype.storm.daemon.worker$fn__4316$exec_fn__1206__auto____4317.invoke(worker.clj:331)
at clojure.lang.AFn.applyToHelper(AFn.java:185)
at clojure.lang.AFn.applyTo(AFn.java:151)
at clojure.core$apply.invoke(core.clj:601)
at backtype.storm.daemon.worker$fn__4316$mk_worker__4372.doInvoke(worker.clj:322)
at clojure.lang.RestFn.invoke(RestFn.java:512)
at backtype.storm.daemon.worker$_main.invoke(worker.clj:432)
at clojure.lang.AFn.applyToHelper(AFn.java:172)
at clojure.lang.AFn.applyTo(AFn.java:151)
at backtype.storm.daemon.worker.main(Unknown Source)
2012-08-10 08:21:27 util [INFO] Halting process: ("Error on initialization")
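To rule out a simple path or permissions issue, a quick check like the one below could be run on supervisor2. It is only a minimal sketch: the class name StormDistCheck and the helper hasFile are made up for illustration, and the default path is just the one copied from the exception above.

```java
import java.io.File;

class StormDistCheck {
    // Returns true if a regular file called `name` exists directly inside `dir`.
    static boolean hasFile(File dir, String name) {
        return new File(dir, name).isFile();
    }

    public static void main(String[] args) {
        // Default path is taken from the exception in the log;
        // pass a different directory as the first argument to override it.
        String path = args.length > 0
                ? args[0]
                : "/opt/storm/local/supervisor/stormdist/DistributedSystem-1-1344586762";
        File dir = new File(path);

        System.out.println("directory exists: " + dir.isDirectory());
        System.out.println("stormconf.ser present: " + hasFile(dir, "stormconf.ser"));

        // list() returns null when the directory is missing or unreadable.
        String[] names = dir.list();
        if (names != null) {
            for (String n : names) {
                System.out.println("  " + n);
            }
        }
    }
}
```

If the directory itself is missing, that would suggest the supervisor never finished downloading the topology files, as opposed to the worker looking in the wrong place.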
The topology I'm trying to run requires 4 workers altogether.
So even if supervisor2 fails, supervisor1 (with its 4 worker slots) should be able to run the entire topology on its own.
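To spell out the slot arithmetic behind that expectation, here is a small sketch. The class name SlotCheck and the method slotsWithout are hypothetical, and it only counts slots; it says nothing about how Storm actually assigns workers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SlotCheck {
    // Sums the worker slots of every supervisor except the one assumed to be down.
    static int slotsWithout(Map<String, Integer> slots, String failed) {
        int total = 0;
        for (Map.Entry<String, Integer> e : slots.entrySet()) {
            if (!e.getKey().equals(failed)) {
                total += e.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Slot counts as described in the cluster layout above.
        Map<String, Integer> slots = new LinkedHashMap<String, Integer>();
        slots.put("supervisor1", 4);
        slots.put("supervisor2", 2);

        int required = 4; // workers requested by the topology
        int remaining = slotsWithout(slots, "supervisor2");

        System.out.println("remaining=" + remaining
                + " required=" + required
                + " enough=" + (remaining >= required));
        // prints: remaining=4 required=4 enough=true
    }
}
```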
Am I doing something wrong here?
Thanks,
Moshe.