spark worker registration failed

moji...@gmail.com

Aug 18, 2013, 10:24:35 PM
to spark-de...@googlegroups.com
Hi everyone,

When I run some Shark operations, I find that after Spark has been running for a while, some of the Spark workers fail. The error log is:

13/08/16 20:37:36 INFO worker.Worker: Asked to launch executor app-20130816203617-0356/27 for Shark::xxxxx
13/08/16 20:37:36 ERROR worker.Worker: key not found: app-20130816203617-0356/26
java.util.NoSuchElementException: key not found: app-20130816203617-0356/26
        at scala.collection.MapLike$class.default(MapLike.scala:225)
        at scala.collection.mutable.HashMap.default(HashMap.scala:45)
        at scala.collection.MapLike$class.apply(MapLike.scala:135)
        at scala.collection.mutable.HashMap.apply(HashMap.scala:45)
        at spark.deploy.worker.Worker$$anonfun$receive$1.apply(Worker.scala:145)
        at spark.deploy.worker.Worker$$anonfun$receive$1.apply(Worker.scala:119)
        at akka.actor.Actor$class.apply(Actor.scala:318)
        at spark.deploy.worker.Worker.apply(Worker.scala:39)
        at akka.actor.ActorCell.invoke(ActorCell.scala:626)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:197)
        at akka.dispatch.Mailbox.run(Mailbox.scala:179)
        at akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec(AbstractDispatcher.scala:516)
        at akka.jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:259)
        at akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975)
        at akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479)
        at akka.jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
13/08/16 20:37:36 INFO worker.ExecutorRunner: Runner thread for executor app-20130816203617-0356/27 interrupted
13/08/16 20:37:36 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{*,null}
13/08/16 20:37:36 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{/json,null}
13/08/16 20:37:36 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{/logPage,null}
13/08/16 20:37:36 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{/log,null}
13/08/16 20:37:36 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{/static,null}
13/08/16 20:37:36 INFO worker.Worker: Starting Spark worker h69-246:44781 with 32 cores, 4.0 GB RAM
13/08/16 20:37:36 INFO worker.Worker: Spark home: /home/hadoop/spark/spark
13/08/16 20:37:36 INFO server.Server: jetty-7.6.8.v20121106
13/08/16 20:37:36 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/static,null}
13/08/16 20:37:36 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/log,null}
13/08/16 20:37:36 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/logPage,null}
13/08/16 20:37:36 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/json,null}
13/08/16 20:37:36 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{*,null}
13/08/16 20:37:36 INFO server.AbstractConnector: Started SelectChann...@0.0.0.0:8081
13/08/16 20:37:36 INFO ui.WorkerWebUI: Started Worker web UI at http://h69-246:8081
13/08/16 20:37:36 INFO worker.Worker: Connecting to master spark://master:7077
13/08/16 20:37:36 ERROR worker.Worker: Worker registration failed: Duplicate worker ID
13/08/16 20:37:36 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{*,null}
13/08/16 20:37:36 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{/json,null}

After I restart these workers, they work again. Does anybody know the reason?
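
For anyone reading the trace: the first exception is what a Scala `mutable.HashMap` throws when `apply` is called with a key that is not in the map. The sketch below only illustrates that behavior and an `Option`-based lookup that avoids it. The object name, the `executors` map and its values are hypothetical stand-ins, not the actual code in Worker.scala, and this does not explain why the key was missing in the first place.

```scala
import scala.collection.mutable

object KeyNotFoundSketch {
  // Hypothetical stand-in for a worker's table of executors, keyed by
  // "<appId>/<executorId>". The real Worker may be organized differently.
  val executors = mutable.HashMap[String, String]()

  // Option-based lookup: returns None for a missing key instead of throwing.
  def lookup(id: String): Option[String] = executors.get(id)

  // Returns true if calling apply() with this key throws NoSuchElementException.
  def applyThrows(id: String): Boolean =
    try { executors(id); false }
    catch { case _: NoSuchElementException => true }

  def main(args: Array[String]): Unit = {
    executors("app-20130816203617-0356/27") = "RUNNING"

    // Executor 26 was never added to the map, so apply() throws,
    // which matches the "key not found" message in the log.
    println("apply on missing key threw: " + applyThrows("app-20130816203617-0356/26"))

    // get() reports the missing entry as None, so the caller can decide what to do.
    println("get on missing key: " + lookup("app-20130816203617-0356/26"))
  }
}
```

In an actor's receive loop an uncaught exception like this leads to the actor being restarted under Akka's default supervision, which would be consistent with the "Starting Spark worker" lines that follow in the log, though that reading is an assumption from the log alone.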