Re: Possible state corruption on supervisor recovery

Harel Ben Attia

Jun 19, 2012, 8:34:20 AM
to storm...@googlegroups.com, izik.sh...@gmail.com
Hi, 

We've been trying to narrow the issue down, so we reinstalled all the cluster machines with a fresh installation of Storm 0.7.3 (we were using 0.7.2 before that) and ran the ExclamationTopology from storm-starter instead of our own topology.

The issue still happens, even with this example topology, so it doesn't seem to be something related to our own topology.

Machine: Dell C6100 (FZ9TLS1) with 8 cores & 12G of RAM.
OS: CentOS release 5.8 Final, 64 bit
Storm version: 0.7.3
jzmq: 2.1.0

Thanks
RL & Izik




On Sunday, June 17, 2012 12:45:03 PM UTC-4, Harel Ben Attia wrote:
Hi,

We've been consistently getting the exception below whenever the supervisor is restarted on a simple single-machine cluster (nimbus + supervisor on the same machine).

Performing all kinds of cleanup, including cleaning up all the local folders (nimbus + supervisor + worker + tmp), does not alleviate the problem. While trying to track down the issue, we've narrowed it down to what seems to be corrupted data inside the localstate/ local folder of the supervisor. However, it seems that the actual data is provided by nimbus, since cleaning up the localstate/ folder before the restart does not help: the data reappears there.

While trying to analyze the issue further, we've discovered a pattern which might implicate a timing issue. Basically, the supervisor restarts properly only if we set the root log4j logger to DEBUG instead of INFO. At first we thought this was incidental, but it has been consistent over multiple days, topology deployments, etc.

Restarting all the processes, including nimbus, and restarting the whole cluster does not help either, and we've even tried deleting the /storm node from ZooKeeper, but the problem still happens. It is important to note that cleaning the entire cluster, including a fresh ZooKeeper, and redeploying the topology did not work either: the supervisor did not manage to start correctly after a restart, and the same exception happened all over again.

As we've mentioned before, the only thing which made the supervisor process start properly was to change the logging level to DEBUG again. 

It is also important to note that once the topology is up and the supervisor is up and running, the topology itself works well.


While trying to analyze the issue, we've saved two instances of the localstate/ data files.
* A data file in the case where the supervisor restarted successfully
* A data file in the case where the supervisor failed to restart with the below exception.
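For anyone who wants to compare two such files, here is a minimal sketch (not Storm code) that lists the class names recorded in a Java-serialized stream. It assumes each localstate/ data file holds a single object written with plain Java serialization, which is what the `Utils.deserialize` frames in the trace below suggest; the class name `SerializedClassNames` and its methods are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of Storm.
public class SerializedClassNames {

    /** ObjectInputStream that records the name of every class descriptor it reads. */
    static final class RecordingStream extends ObjectInputStream {
        final List<String> names = new ArrayList<>();

        RecordingStream(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected ObjectStreamClass readClassDescriptor()
                throws IOException, ClassNotFoundException {
            ObjectStreamClass desc = super.readClassDescriptor();
            // The name is recorded before the JVM tries to load the class,
            // so classes missing from the classpath are listed as well.
            names.add(desc.getName());
            return desc;
        }
    }

    /** Returns the class names found in the stream, in the order they were read. */
    static List<String> list(byte[] data) {
        try (RecordingStream in = new RecordingStream(new ByteArrayInputStream(data))) {
            try {
                in.readObject();
            } catch (ClassNotFoundException e) {
                // Expected for a "bad" file; the names gathered so far are still returned.
            }
            return in.names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Plain Java serialization, used here only to produce sample input. */
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Feeding it the bytes of each file (for example via `java.nio.file.Files.readAllBytes`) and diffing the two lists should show which class appears only in the failing one.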

We've attached the files; hopefully they will help shed some light on this. A copy of the supervisor log file for the crash is also attached, all bundled into a .tar.gz due to Google restrictions.

This issue prevents us from going forward with our performance analysis of the cluster, because the only time we can actually use the topology is when the entire cluster is in debug mode, which definitely skews the results given all the extra I/O done for logging.

We'd appreciate any advice on this,

RL and Izik

Here is the exception in the supervisor log:
java.lang.RuntimeException: java.lang.ClassNotFoundException: clojure.core$concat$fn__106
at backtype.storm.utils.Utils.deserialize(Utils.java:61)
at backtype.storm.utils.LocalState.snapshot(LocalState.java:24)
at backtype.storm.utils.LocalState.put(LocalState.java:32)
at backtype.storm.daemon.supervisor$sync_processes.invoke(supervisor.clj:230)
at clojure.lang.AFn.applyToHelper(AFn.java:161)
at clojure.lang.AFn.applyTo(AFn.java:151)
at clojure.core$apply.invoke(core.clj:603)
at clojure.core$partial$fn__444.doInvoke(core.clj:2343)
at clojure.lang.RestFn.invoke(RestFn.java:397)
at backtype.storm.event$event_manager$fn__7775.invoke(event.clj:24)
at clojure.lang.AFn.run(AFn.java:24)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: clojure.core$concat$fn__106
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:603)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1574)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1495)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1731)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1666)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1322)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
at java.util.HashMap.readObject(HashMap.java:1030)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
at backtype.storm.utils.Utils.deserialize(Utils.java:55)
... 11 more
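
The trace boils down to Java deserialization hitting a class name that the reading JVM cannot load. Below is a minimal, self-contained sketch of that failure mode, not Storm's actual code: the missing class is simulated by a stream that refuses to resolve one chosen name, and the `ClassNotFoundException` is wrapped in a `RuntimeException` the way the top frames above suggest. `MissingClassDemo` and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.UncheckedIOException;

// Hypothetical demo, not part of Storm.
public class MissingClassDemo {

    /** Stream that pretends one class is absent from the classpath. */
    static final class HidingStream extends ObjectInputStream {
        private final String hidden;

        HidingStream(InputStream in, String hidden) throws IOException {
            super(in);
            this.hidden = hidden;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            if (desc.getName().equals(hidden)) {
                // Simulates what the class loader reports for an unknown name.
                throw new ClassNotFoundException(hidden);
            }
            return super.resolveClass(desc);
        }
    }

    /** Plain Java serialization of the value. */
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Deserializes data; pass null for hidden to resolve every class normally. */
    static Object deserialize(byte[] data, String hidden) {
        try (ObjectInputStream in = new HidingStream(new ByteArrayInputStream(data), hidden)) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            // Same shape as the trace: RuntimeException caused by ClassNotFoundException.
            throw new RuntimeException(e);
        }
    }
}
```

With nothing hidden, a map containing a list round-trips fine; hiding the list's class makes the whole map unreadable, even though the map's own class resolves. That mirrors the trace above, where the failure surfaces several levels deep inside `HashMap.readObject`.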

Nathan Milford

Jun 21, 2012, 6:38:55 PM
to storm...@googlegroups.com, izik.sh...@gmail.com, Ha...@outbrain.com
I'm part of the same team as Izik and Harel.  I'm the ops guy :P

I got nothing here.  I can run the example topologies locally, but not distributed.

* I'm now running the zookeeper from CDH4, and have built out and distributed the zeromq and jzmq versions recommended in the wiki.
* I've rebuilt the nodes from kickstart to shake out any residual cruft.
* I've run it as an unpriv'd storm user with open permissions on the local dir, as well as root.
* I've explicitly disabled ipv6 via the childopts, as well as set the nimbus and zookeeper config to be all IPs rather than hostnames to eliminate DNS issues.
* I built RPMs and init scripts for Storm, but to eliminate that as a potential source of issues, I rebuilt the whole 3-node cluster by hand from the binary distro and get the same probs.

For reference, the RPM specs and scripts are here:

What am I missing?

I'll be at Velocity next week.  Happy to buy someone some beer and food for some advice.

(ro...@storm01.nydc1:~)# storm jar /outbrain/storm-starter/target/storm-starter-0.0.1-SNAPSHOT-jar-with-dependencies.jar  storm.starter.ExclamationTopology
Running: java -client -Dstorm.home=/opt/storm-0.7.3 -Dstorm.options= -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -Dstorm.jar=/outbrain/storm-starter/target/storm-starter-0.0.1-SNAPSHOT-jar-with-dependencies.jar -cp /opt/storm-0.7.3/storm-0.7.3.jar:/opt/storm-0.7.3/lib/asm-3.2.jar:/opt/storm-0.7.3/lib/commons-fileupload-1.2.1.jar:/opt/storm-0.7.3/lib/zookeeper-3.3.3.jar:/opt/storm-0.7.3/lib/commons-logging-1.1.1.jar:/opt/storm-0.7.3/lib/tools.logging-0.2.3.jar:/opt/storm-0.7.3/lib/tools.cli-0.2.1.jar:/opt/storm-0.7.3/lib/core.incubator-0.1.0.jar:/opt/storm-0.7.3/lib/math.numeric-tower-0.0.1.jar:/opt/storm-0.7.3/lib/clj-time-0.4.1.jar:/opt/storm-0.7.3/lib/slf4j-api-1.5.8.jar:/opt/storm-0.7.3/lib/tools.macro-0.1.0.jar:/opt/storm-0.7.3/lib/servlet-api-2.5.jar:/opt/storm-0.7.3/lib/log4j-1.2.16.jar:/opt/storm-0.7.3/lib/jetty-6.1.26.jar:/opt/storm-0.7.3/lib/guava-10.0.1.jar:/opt/storm-0.7.3/lib/jsr305-1.3.9.jar:/opt/storm-0.7.3/lib/jzmq-2.1.0.jar:/opt/storm-0.7.3/lib/slf4j-log4j12-1.5.8.jar:/opt/storm-0.7.3/lib/hiccup-0.3.6.jar:/opt/storm-0.7.3/lib/commons-lang-2.5.jar:/opt/storm-0.7.3/lib/servlet-api-2.5-20081211.jar:/opt/storm-0.7.3/lib/curator-client-1.0.1.jar:/opt/storm-0.7.3/lib/curator-framework-1.0.1.jar:/opt/storm-0.7.3/lib/commons-exec-1.1.jar:/opt/storm-0.7.3/lib/commons-codec-1.4.jar:/opt/storm-0.7.3/lib/joda-time-2.0.jar:/opt/storm-0.7.3/lib/reflectasm-1.01.jar:/opt/storm-0.7.3/lib/jline-0.9.94.jar:/opt/storm-0.7.3/lib/ring-core-0.3.10.jar:/opt/storm-0.7.3/lib/minlog-1.2.jar:/opt/storm-0.7.3/lib/kryo-1.04.jar:/opt/storm-0.7.3/lib/commons-io-1.4.jar:/opt/storm-0.7.3/lib/snakeyaml-1.9.jar:/opt/storm-0.7.3/lib/carbonite-1.0.1.jar:/opt/storm-0.7.3/lib/libthrift7-0.7.0.jar:/opt/storm-0.7.3/lib/ring-jetty-adapter-0.3.11.jar:/opt/storm-0.7.3/lib/compojure-0.6.4.jar:/opt/storm-0.7.3/lib/jetty-util-6.1.26.jar:/opt/storm-0.7.3/lib/httpcore-4.1.jar:/opt/storm-0.7.3/lib/httpclient-4.1.1.jar:/opt/storm-0.7.3/lib/ring-servlet-0.3.11.jar:/opt/storm-0.7.3/lib/clojure-1.4.0.jar:/opt/storm-0.7.3/lib/json-simple-1.1.jar:/opt/storm-0.7.3/lib/clout-0.4.1.jar:/opt/storm-0.7.3/lib/junit-3.8.1.jar:/outbrain/storm-starter/target/storm-starter-0.0.1-SNAPSHOT-jar-with-dependencies.jar:/root/.storm:/opt/storm-0.7.3/bin storm.starter.ExclamationTopology
0    [main] INFO  backtype.storm.zookeeper  - Starting inprocess zookeeper at port 2000 and dir /tmp/04379999-74ae-49ea-95fc-3d048e520ca6
204  [main] INFO  backtype.storm.daemon.nimbus  - Starting Nimbus with conf {"dev.zookeeper.path" "/tmp/dev-storm-zookeeper", "topology.fall.back.on.java.serialization" true, "zmq.linger.millis" 0, "topology.skip.missing.kryo.registrations" true, "ui.childopts" "-Xmx768m", "storm.zookeeper.session.timeout" 20000, "nimbus.reassign" true, "nimbus.monitor.freq.secs" 10, "java.library.path" "/usr/local/lib:/opt/local/lib:/usr/lib", "storm.local.dir" "/tmp/1a9a2d49-cf65-4ae3-8555-285c510ebcee", "supervisor.worker.start.timeout.secs" 120, "nimbus.cleanup.inbox.freq.secs" 600, "nimbus.inbox.jar.expiration.secs" 3600, "nimbus.host" "localhost", "storm.zookeeper.port" 2000, "transactional.zookeeper.port" nil, "transactional.zookeeper.servers" nil, "storm.zookeeper.root" "/storm", "supervisor.enable" true, "storm.zookeeper.servers" ["localhost"], "transactional.zookeeper.root" "/transactional", "topology.worker.childopts" nil, "worker.childopts" "-Xmx768m", "supervisor.heartbeat.frequency.secs" 5, "drpc.port" 3772, "supervisor.monitor.frequency.secs" 3, "task.heartbeat.frequency.secs" 3, "topology.max.spout.pending" nil, "storm.zookeeper.retry.interval" 1000, "supervisor.slots.ports" [6700 6701 6702 6703], "topology.debug" false, "nimbus.task.launch.secs" 120, "nimbus.supervisor.timeout.secs" 60, "topology.message.timeout.secs" 30, "task.refresh.poll.secs" 10, "topology.workers" 1, "supervisor.childopts" "-Xmx1024m", "nimbus.thrift.port" 6627, "topology.stats.sample.rate" 0.05, "worker.heartbeat.frequency.secs" 1, "nimbus.task.timeout.secs" 30, "drpc.invocations.port" 3773, "zmq.threads" 1, "storm.zookeeper.retry.times" 5, "topology.state.synchronization.timeout.secs" 60, "supervisor.worker.timeout.secs" 30, "nimbus.file.copy.expiration.secs" 600, "drpc.request.timeout.secs" 600, "storm.local.mode.zmq" false, "ui.port" 8080, "nimbus.childopts" "-Xmx1024m", "topology.ackers" 1, "storm.cluster.mode" "local", "topology.optimize" true, "topology.max.task.parallelism" nil}
236  [main] INFO  com.netflix.curator.framework.imps.CuratorFrameworkImpl  - Starting
281  [main-EventThread] INFO  backtype.storm.zookeeper  - Zookeeper state update: :connected:none
315  [main] INFO  com.netflix.curator.framework.imps.CuratorFrameworkImpl  - Starting
360  [main] INFO  com.netflix.curator.framework.imps.CuratorFrameworkImpl  - Starting
368  [main-EventThread] INFO  backtype.storm.zookeeper  - Zookeeper state update: :connected:none
373  [main] INFO  com.netflix.curator.framework.imps.CuratorFrameworkImpl  - Starting
375  [main] INFO  com.netflix.curator.framework.imps.CuratorFrameworkImpl  - Starting
379  [main-EventThread] INFO  backtype.storm.zookeeper  - Zookeeper state update: :connected:none
382  [main] INFO  com.netflix.curator.framework.imps.CuratorFrameworkImpl  - Starting
404  [main] INFO  backtype.storm.daemon.supervisor  - Starting Supervisor with conf {"dev.zookeeper.path" "/tmp/dev-storm-zookeeper", "topology.fall.back.on.java.serialization" true, "zmq.linger.millis" 0, "topology.skip.missing.kryo.registrations" true, "ui.childopts" "-Xmx768m", "storm.zookeeper.session.timeout" 20000, "nimbus.reassign" true, "nimbus.monitor.freq.secs" 10, "java.library.path" "/usr/local/lib:/opt/local/lib:/usr/lib", "storm.local.dir" "/tmp/c0e92e8e-640f-42f5-9cc7-221d53c4f203", "supervisor.worker.start.timeout.secs" 120, "nimbus.cleanup.inbox.freq.secs" 600, "nimbus.inbox.jar.expiration.secs" 3600, "nimbus.host" "localhost", "storm.zookeeper.port" 2000, "transactional.zookeeper.port" nil, "transactional.zookeeper.servers" nil, "storm.zookeeper.root" "/storm", "supervisor.enable" true, "storm.zookeeper.servers" ["localhost"], "transactional.zookeeper.root" "/transactional", "topology.worker.childopts" nil, "worker.childopts" "-Xmx768m", "supervisor.heartbeat.frequency.secs" 5, "drpc.port" 3772, "supervisor.monitor.frequency.secs" 3, "task.heartbeat.frequency.secs" 3, "topology.max.spout.pending" nil, "storm.zookeeper.retry.interval" 1000, "supervisor.slots.ports" (1 2 3), "topology.debug" false, "nimbus.task.launch.secs" 120, "nimbus.supervisor.timeout.secs" 60, "topology.message.timeout.secs" 30, "task.refresh.poll.secs" 10, "topology.workers" 1, "supervisor.childopts" "-Xmx1024m", "nimbus.thrift.port" 6627, "topology.stats.sample.rate" 0.05, "worker.heartbeat.frequency.secs" 1, "nimbus.task.timeout.secs" 30, "drpc.invocations.port" 3773, "zmq.threads" 1, "storm.zookeeper.retry.times" 5, "topology.state.synchronization.timeout.secs" 60, "supervisor.worker.timeout.secs" 30, "nimbus.file.copy.expiration.secs" 600, "drpc.request.timeout.secs" 600, "storm.local.mode.zmq" false, "ui.port" 8080, "nimbus.childopts" "-Xmx1024m", "topology.ackers" 1, "storm.cluster.mode" "local", "topology.optimize" true, "topology.max.task.parallelism" nil}
419  [main] INFO  com.netflix.curator.framework.imps.CuratorFrameworkImpl  - Starting
422  [main-EventThread] INFO  backtype.storm.zookeeper  - Zookeeper state update: :connected:none
434  [main] INFO  com.netflix.curator.framework.imps.CuratorFrameworkImpl  - Starting
471  [main] INFO  backtype.storm.daemon.supervisor  - Starting supervisor with id 169977fe-d7db-40e1-b6f1-2d6f4501ad82 at host ob1065046.nydc1.outbrain.com
474  [main] INFO  backtype.storm.daemon.supervisor  - Starting Supervisor with conf {"dev.zookeeper.path" "/tmp/dev-storm-zookeeper", "topology.fall.back.on.java.serialization" true, "zmq.linger.millis" 0, "topology.skip.missing.kryo.registrations" true, "ui.childopts" "-Xmx768m", "storm.zookeeper.session.timeout" 20000, "nimbus.reassign" true, "nimbus.monitor.freq.secs" 10, "java.library.path" "/usr/local/lib:/opt/local/lib:/usr/lib", "storm.local.dir" "/tmp/b795d26b-7726-42c8-8a59-67455e1098b1", "supervisor.worker.start.timeout.secs" 120, "nimbus.cleanup.inbox.freq.secs" 600, "nimbus.inbox.jar.expiration.secs" 3600, "nimbus.host" "localhost", "storm.zookeeper.port" 2000, "transactional.zookeeper.port" nil, "transactional.zookeeper.servers" nil, "storm.zookeeper.root" "/storm", "supervisor.enable" true, "storm.zookeeper.servers" ["localhost"], "transactional.zookeeper.root" "/transactional", "topology.worker.childopts" nil, "worker.childopts" "-Xmx768m", "supervisor.heartbeat.frequency.secs" 5, "drpc.port" 3772, "supervisor.monitor.frequency.secs" 3, "task.heartbeat.frequency.secs" 3, "topology.max.spout.pending" nil, "storm.zookeeper.retry.interval" 1000, "supervisor.slots.ports" (4 5 6), "topology.debug" false, "nimbus.task.launch.secs" 120, "nimbus.supervisor.timeout.secs" 60, "topology.message.timeout.secs" 30, "task.refresh.poll.secs" 10, "topology.workers" 1, "supervisor.childopts" "-Xmx1024m", "nimbus.thrift.port" 6627, "topology.stats.sample.rate" 0.05, "worker.heartbeat.frequency.secs" 1, "nimbus.task.timeout.secs" 30, "drpc.invocations.port" 3773, "zmq.threads" 1, "storm.zookeeper.retry.times" 5, "topology.state.synchronization.timeout.secs" 60, "supervisor.worker.timeout.secs" 30, "nimbus.file.copy.expiration.secs" 600, "drpc.request.timeout.secs" 600, "storm.local.mode.zmq" false, "ui.port" 8080, "nimbus.childopts" "-Xmx1024m", "topology.ackers" 1, "storm.cluster.mode" "local", "topology.optimize" true, "topology.max.task.parallelism" nil}
476  [main] INFO  com.netflix.curator.framework.imps.CuratorFrameworkImpl  - Starting
479  [main-EventThread] INFO  backtype.storm.zookeeper  - Zookeeper state update: :connected:none
482  [main] INFO  com.netflix.curator.framework.imps.CuratorFrameworkImpl  - Starting
501  [main] INFO  backtype.storm.daemon.supervisor  - Starting supervisor with id 9700cb2f-3e8e-40f9-88f5-cadb8c65a4d5 at host ob1065046.nydc1.outbrain.com
555  [main] INFO  backtype.storm.daemon.nimbus  - Received topology submission for test with conf {"topology.ackers" 1, "topology.kryo.register" nil, "topology.name" "test", "storm.id" "test-1-1340315323", "topology.debug" true}
684  [main] INFO  backtype.storm.daemon.nimbus  - Task test-1-1340315323:1 timed out
686  [main] INFO  backtype.storm.daemon.nimbus  - Task test-1-1340315323:2 timed out
687  [main] INFO  backtype.storm.daemon.nimbus  - Task test-1-1340315323:3 timed out
689  [main] INFO  backtype.storm.daemon.nimbus  - Task test-1-1340315323:4 timed out
690  [main] INFO  backtype.storm.daemon.nimbus  - Task test-1-1340315323:5 timed out
691  [main] INFO  backtype.storm.daemon.nimbus  - Task test-1-1340315323:6 timed out
692  [main] INFO  backtype.storm.daemon.nimbus  - Task test-1-1340315323:7 timed out
693  [main] INFO  backtype.storm.daemon.nimbus  - Task test-1-1340315323:8 timed out
694  [main] INFO  backtype.storm.daemon.nimbus  - Task test-1-1340315323:9 timed out
696  [main] INFO  backtype.storm.daemon.nimbus  - Task test-1-1340315323:10 timed out
697  [main] INFO  backtype.storm.daemon.nimbus  - Task test-1-1340315323:11 timed out
698  [main] INFO  backtype.storm.daemon.nimbus  - Task test-1-1340315323:12 timed out
699  [main] INFO  backtype.storm.daemon.nimbus  - Task test-1-1340315323:13 timed out
700  [main] INFO  backtype.storm.daemon.nimbus  - Task test-1-1340315323:14 timed out
702  [main] INFO  backtype.storm.daemon.nimbus  - Task test-1-1340315323:15 timed out
703  [main] INFO  backtype.storm.daemon.nimbus  - Task test-1-1340315323:16 timed out
713  [main] INFO  backtype.storm.daemon.nimbus  - Reassigning test-1-1340315323 to 1 slots
713  [main] INFO  backtype.storm.daemon.nimbus  - Reassign ids: [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16]
714  [main] INFO  backtype.storm.daemon.nimbus  - Available slots: (["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3] ["9700cb2f-3e8e-40f9-88f5-cadb8c65a4d5" 6] ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 1] ["9700cb2f-3e8e-40f9-88f5-cadb8c65a4d5" 4] ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 2] ["9700cb2f-3e8e-40f9-88f5-cadb8c65a4d5" 5])
734  [main] INFO  backtype.storm.daemon.nimbus  - Setting new assignment for storm id test-1-1340315323: #backtype.storm.daemon.common.Assignment{:master-code-dir "/tmp/1a9a2d49-cf65-4ae3-8555-285c510ebcee/nimbus/stormdist/test-1-1340315323", :node->host {"169977fe-d7db-40e1-b6f1-2d6f4501ad82" "ob1065046.nydc1.outbrain.com"}, :task->node+port {1 ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3], 2 ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3], 3 ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3], 4 ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3], 5 ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3], 6 ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3], 7 ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3], 8 ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3], 9 ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3], 10 ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3], 11 ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3], 12 ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3], 13 ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3], 14 ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3], 15 ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3], 16 ["169977fe-d7db-40e1-b6f1-2d6f4501ad82" 3]}, :task->start-time-secs {1 1340315324, 2 1340315324, 3 1340315324, 4 1340315324, 5 1340315324, 6 1340315324, 7 1340315324, 8 1340315324, 9 1340315324, 10 1340315324, 11 1340315324, 12 1340315324, 13 1340315324, 14 1340315324, 15 1340315324, 16 1340315324}}
748  [main] INFO  backtype.storm.daemon.nimbus  - Activating test: test-1-1340315323
1456 [Thread-5] INFO  backtype.storm.daemon.supervisor  - Downloading code for storm id test-1-1340315323 from /tmp/1a9a2d49-cf65-4ae3-8555-285c510ebcee/nimbus/stormdist/test-1-1340315323
1499 [Thread-8] INFO  backtype.storm.daemon.supervisor  - Downloading code for storm id test-1-1340315323 from /tmp/1a9a2d49-cf65-4ae3-8555-285c510ebcee/nimbus/stormdist/test-1-1340315323
1704 [Thread-5] INFO  backtype.storm.daemon.supervisor  - Extracting resources from jar at /outbrain/storm-starter/target/storm-starter-0.0.1-SNAPSHOT-jar-with-dependencies.jar to /tmp/c0e92e8e-640f-42f5-9cc7-221d53c4f203/supervisor/stormdist/test-1-1340315323/resources
1710 [Thread-8] INFO  backtype.storm.daemon.supervisor  - Extracting resources from jar at /outbrain/storm-starter/target/storm-starter-0.0.1-SNAPSHOT-jar-with-dependencies.jar to /tmp/b795d26b-7726-42c8-8a59-67455e1098b1/supervisor/stormdist/test-1-1340315323/resources
1717 [Thread-5] INFO  backtype.storm.daemon.supervisor  - Finished downloading code for storm id test-1-1340315323 from /tmp/1a9a2d49-cf65-4ae3-8555-285c510ebcee/nimbus/stormdist/test-1-1340315323
1717 [Thread-8] INFO  backtype.storm.daemon.supervisor  - Finished downloading code for storm id test-1-1340315323 from /tmp/1a9a2d49-cf65-4ae3-8555-285c510ebcee/nimbus/stormdist/test-1-1340315323
1730 [Thread-6] ERROR backtype.storm.event  - Error when processing event
java.lang.RuntimeException: java.lang.ClassNotFoundException: clojure.core$concat$fn__106
at backtype.storm.utils.Utils.deserialize(Utils.java:61)
at backtype.storm.utils.LocalState.snapshot(LocalState.java:24)
at backtype.storm.utils.LocalState.get(LocalState.java:28)
at backtype.storm.daemon.supervisor$sync_processes.invoke(supervisor.clj:189)
at clojure.lang.AFn.applyToHelper(AFn.java:161)
at clojure.lang.AFn.applyTo(AFn.java:151)
at clojure.core$apply.invoke(core.clj:603)
at clojure.core$partial$fn__444.doInvoke(core.clj:2343)
at clojure.lang.RestFn.invoke(RestFn.java:397)
at backtype.storm.event$event_manager$fn__7769.invoke(event.clj:24)
1738 [Thread-6] INFO  backtype.storm.util  - Halting process: ("Error when processing an event")

Nathan Marz

Jun 21, 2012, 7:05:53 PM
to storm...@googlegroups.com
There's a pull request open that fixes this: https://github.com/nathanmarz/storm/pull/246

You can try applying that and building your own release to confirm that it fixes the problem.
--
Twitter: @nathanmarz
http://nathanmarz.com

Nathan Milford

Jun 21, 2012, 9:29:42 PM
to storm...@googlegroups.com
That did it!

Thanks Nathan :)

If you're at Velocity I will buy you a beer or beverage or something.

Or, I'll have my secret friends at twitter place random presents on your desk before you get in in the morning :P

- n