spark-shell on cluster connects, then exits right away


Evan Chan

Apr 2, 2013, 6:26:33 PM
to spark...@googlegroups.com, Krzysztof Wilczynski
We've spent the better part of today trying to get spark-shell to actually connect to our cluster, with no luck. Any debugging hints would be appreciated.

Spark 0.7.0
Running in standalone mode on Ubuntu Linux.
Currently just single node, both master and slave on the same box.

ev@u9-r1:~$ MASTER="spark://u9-r1.mtv:7077" spark-shell
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 0.7.0
      /_/

Using Scala version 2.9.2 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_26)
Initializing interpreter...
Creating SparkContext...
2013-04-02 21:03:10 INFO  [spark-akka.actor.default-dispatcher-2]: Slf4jEventHandler:61 - Slf4jEventHandler started
2013-04-02 21:03:10 INFO  [Thread-11]: BlockManagerMaster:31 - Registered BlockManagerMaster Actor
2013-04-02 21:03:10 INFO  [Thread-11]: MemoryStore:31 - MemoryStore started with capacity 323.9 MB.
2013-04-02 21:03:10 INFO  [Thread-11]: DiskStore:31 - Created local directory at /var/lib/spark/spark-local-20130402210310-ec47
2013-04-02 21:03:10 INFO  [Thread-11]: ConnectionManager:31 - Bound socket to port 44995 with id = ConnectionManagerId(u9-r1.mtv,44995)
2013-04-02 21:03:10 INFO  [Thread-11]: BlockManagerMaster:31 - Trying to register BlockManager
2013-04-02 21:03:10 INFO  [Thread-11]: BlockManagerMaster:31 - Registered BlockManager
2013-04-02 21:03:10 INFO  [Thread-11]: HttpBroadcast:31 - Broadcast server started at http://10.11.1.19:34296
2013-04-02 21:03:10 INFO  [Thread-11]: MapOutputTracker:31 - Registered MapOutputTrackerActor actor
2013-04-02 21:03:10 INFO  [Thread-11]: HttpFileServer:31 - HTTP File server directory is /tmp/spark-6db8c510-3184-4ae4-b492-9ecb062ff874
2013-04-02 21:03:10 INFO  [spark-akka.actor.default-dispatcher-3]: IoWorker:55 - IoWorker thread 'spray-io-worker-0' started
2013-04-02 21:03:10 INFO  [spark-akka.actor.default-dispatcher-2]: HttpServer:55 - akka://spark/user/BlockManagerHTTPServer started on /0.0.0.0:39861
2013-04-02 21:03:10 INFO  [Thread-11]: BlockManagerUI:31 - Started BlockManager web UI at http://u9-r1.mtv:39861
2013-04-02 21:03:10 INFO  [spark-akka.actor.default-dispatcher-3]: Client$ClientActor:31 - Connecting to master spark://u9-r1.mtv:7077
Spark context available as sc.
2013-04-02 21:03:10 INFO  [spark-akka.actor.default-dispatcher-3]: SparkDeploySchedulerBackend:31 - Connected to Spark cluster with app ID app-20130402210310-0007
2013-04-02 21:03:10 INFO  [spark-akka.actor.default-dispatcher-3]: Client$ClientActor:31 - Executor added: app-20130402210310-0007/0 on worker-20130402202223-u9-r1.mtv-49363 (u9-r1.mtv) with 8 cores
2013-04-02 21:03:10 INFO  [spark-akka.actor.default-dispatcher-3]: SparkDeploySchedulerBackend:31 - Granted executor ID app-20130402210310-0007/0 on host u9-r1.mtv with 8 cores, 512.0 MB RAM
2013-04-02 21:03:11 ERROR [spark-akka.actor.default-dispatcher-1]: Client$ClientActor:47 - Connection to master failed; stopping client
2013-04-02 21:03:11 ERROR [spark-akka.actor.default-dispatcher-1]: SparkDeploySchedulerBackend:47 - Disconnected from Spark cluster!
2013-04-02 21:03:11 ERROR [spark-akka.actor.default-dispatcher-1]: ClusterScheduler:47 - Exiting due to error from cluster scheduler: Disconnected from Spark cluster

Here are the relevant master logs:

2013-04-02 21:03:10 INFO  [sparkMaster-akka.actor.default-dispatcher-77]: Master:31 - Registering app Spark shell
2013-04-02 21:03:10 INFO  [sparkMaster-akka.actor.default-dispatcher-77]: Master:31 - Registered app Spark shell with ID app-20130402210310-0007
2013-04-02 21:03:10 INFO  [sparkMaster-akka.actor.default-dispatcher-77]: Master:31 - Launching executor app-20130402210310-0007/0 on worker worker-20130402202223-u9-r1.mtv-49363
2013-04-02 21:03:11 INFO  [sparkMaster-akka.actor.default-dispatcher-77]: Master:31 - Removing app app-20130402210310-0007
2013-04-02 21:03:11 WARN  [sparkMaster-akka.actor.default-dispatcher-77]: Master:43 - Got status update for unknown executor app-20130402210310-0007/0

As you can see, the app registers with the master but is removed immediately. The UI shows the same thing: the "Spark shell" app runs and then exits right away (0 seconds).

The slave logs show something similar:

2013-04-02 21:03:10 INFO  [sparkWorker-akka.actor.default-dispatcher-6]: Worker:31 - Asked to launch executor app-20130402210310-0007/0 for Spark shell
2013-04-02 21:03:11 INFO  [sparkWorker-akka.actor.default-dispatcher-6]: Worker:31 - Asked to kill executor app-20130402210310-0007/0
2013-04-02 21:03:11 INFO  [sparkWorker-akka.actor.default-dispatcher-6]: ExecutorRunner:31 - Killing process!
2013-04-02 21:03:11 INFO  [ExecutorRunner for app-20130402210310-0007/0]: ExecutorRunner:31 - Runner thread for executor app-20130402210310-0007/0 interrupted
2013-04-02 21:03:11 INFO  [sparkWorker-akka.actor.default-dispatcher-6]: Worker:31 - Executor app-20130402210310-0007/0 finished with state KILLED
2013-04-02 21:03:12 INFO  [redirect output to /var/lib/spark/app-20130402210310-0007/0/stdout]: ExecutorRunner:31 - Redirection to /var/lib/spark/app-20130402210310-0007/0/stdout closed: Stream closed

Any help appreciated, thanks!
-Evan

Evan Chan

Apr 2, 2013, 6:31:04 PM
to spark...@googlegroups.com, Krzysztof Wilczynski
Oh: if we modify the spark-shell script so that it doesn't set LAUNCH_AS_SCALA=1, then it works, but after you exit, the terminal gets messed up and you have to run reset.

Patrick Wendell

Apr 2, 2013, 7:04:40 PM
to spark...@googlegroups.com, Krzysztof Wilczynski
Evan - do you have any idea why this is different when you launch with scala vs. java? Or did you just find that it happened to work in that case?

- Patrick


--
You received this message because you are subscribed to the Google Groups "Spark Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to spark-users...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Patrick Wendell

Apr 2, 2013, 7:07:03 PM
to spark...@googlegroups.com, Krzysztof Wilczynski
One thing - make sure that all the ports are open (in both directions) between the master and the client, as well as between the master and the slaves. It could be that an akka connection is timing out, and that's why it's shutting down the executor.
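A quick way to sanity-check connectivity is a small port probe. This is a sketch; the hostname and port numbers below are taken from the logs earlier in this thread and will differ on your cluster.

```python
# Probe TCP ports on a host and report which ones accept connections.
# Run this from the client toward the master, and again from the master
# toward the client, since connections are needed in both directions.
import socket

def probe_ports(host, ports, timeout=2.0):
    """Map each port to True if a TCP connect succeeds within `timeout`."""
    results = {}
    for port in ports:
        try:
            conn = socket.create_connection((host, port), timeout=timeout)
            conn.close()
            results[port] = True
        except OSError:
            results[port] = False
    return results

if __name__ == "__main__":
    # 7077 is the standalone master port; the others appeared in the driver log.
    for port, ok in sorted(probe_ports("u9-r1.mtv", [7077, 44995, 39861]).items()):
        print(port, "open" if ok else "blocked or unreachable")
```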

This might show up if you enable akka logging:

Modify spark/conf/spark-env.sh and change the following line:

old:
SPARK_JAVA_OPTS+=" -Dspark.local.dir=/mnt/spark,/mnt2/spark"

new:
SPARK_JAVA_OPTS+=" -Dspark.local.dir=/mnt/spark,/mnt2/spark -Dspark.akka.logLifecycleEvents=true"

Ideally do this on the master, slaves, and at the client.

Evan Chan

Apr 2, 2013, 7:15:24 PM
to spark...@googlegroups.com, Krzysztof Wilczynski
Also, I'm able to connect remotely; however, every minute there is a connection failure and it has to reconnect:

scala> 13/04/02 16:00:55 INFO cluster.SparkDeploySchedulerBackend: Executor 1 disconnected, so removing it
13/04/02 16:00:55 ERROR cluster.ClusterScheduler: Lost an executor 1 (already removed): remote Akka client shutdown
13/04/02 16:00:55 INFO client.Client$ClientActor: Executor updated: app-20130402225934-0003/1 is now FAILED (Command exited with code 1)
13/04/02 16:00:55 INFO cluster.SparkDeploySchedulerBackend: Executor app-20130402225934-0003/1 removed: Command exited with code 1
13/04/02 16:00:55 INFO client.Client$ClientActor: Executor added: app-20130402225934-0003/2 on worker-20130402225332-u9-r1.mtv-53693 (u9-r1.mtv) with 8 cores
13/04/02 16:00:55 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20130402225934-0003/2 on host u9-r1.mtv with 8 cores, 512.0 MB RAM
13/04/02 16:00:55 INFO client.Client$ClientActor: Executor updated: app-20130402225934-0003/2 is now RUNNING
13/04/02 16:00:56 INFO cluster.SparkDeploySchedulerBackend: Registered executor: Actor[akka://sparkE...@u9-r1.mtv:48641/user/Executor] with ID 2

Does that make any sense?

The master logs show:
2013-04-02 22:59:34 INFO  [sparkMaster-akka.actor.default-dispatcher-18]: Master:31 - Launching executor app-20130402225934-0003/0 on worker worker-20130402225332-u9-r1.mtv-53693
2013-04-02 23:00:14 INFO  [sparkMaster-akka.actor.default-dispatcher-20]: Master:31 - Removing executor app-20130402225934-0003/0 because it is FAILED

I discovered there are logs for every executor, and I see executor logs like this:

2013-04-02 23:00:11 WARN  [sparkExecutor-akka.actor.default-dispatcher-1]: BlockManagerMaster:64 - Error sending message to BlockManagerMaster in 3 attempts
java.util.concurrent.TimeoutException: Futures timed out after [10000] milliseconds
at akka.dispatch.DefaultPromise.ready(Future.scala:870)
at akka.dispatch.DefaultPromise.result(Future.scala:874)
at akka.dispatch.Await$.result(Future.scala:74)
at spark.storage.BlockManagerMaster.askDriverWithReply(BlockManagerMaster.scala:154)
at spark.storage.BlockManagerMaster.tell(BlockManagerMaster.scala:133)
at spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:64)
at spark.storage.BlockManager.initialize(BlockManager.scala:123)
at spark.storage.BlockManager.<init>(BlockManager.scala:108)
at spark.storage.BlockManager.<init>(BlockManager.scala:115)
at spark.SparkEnv$.createFromSystemProperties(SparkEnv.scala:91)
at spark.executor.Executor.initialize(Executor.scala:72)
at spark.executor.StandaloneExecutorBackend$$anonfun$receive$1.apply(StandaloneExecutorBackend.scala:39)
at spark.executor.StandaloneExecutorBackend$$anonfun$receive$1.apply(StandaloneExecutorBackend.scala:36)
at akka.actor.Actor$class.apply(Actor.scala:318)
at spark.executor.StandaloneExecutorBackend.apply(StandaloneExecutorBackend.scala:16)
at akka.actor.ActorCell.invoke(ActorCell.scala:626)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:197)
at akka.dispatch.Mailbox.run(Mailbox.scala:179)
at akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec(AbstractDispatcher.scala:516)
at akka.jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:259)
at akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975)
at akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479)
at akka.jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
2013-04-02 23:00:14 INFO  [sparkExecutor-akka.actor.default-dispatcher-3]: StandaloneExecutorBackend:31 - Connecting to driver: akka://sp...@172.16.101.44:63935/user/StandaloneScheduler
2013-04-02 23:00:14 ERROR [sparkExecutor-akka.actor.default-dispatcher-2]: StandaloneExecutorBackend:47 - Error sending message to BlockManagerMaster [message = RegisterBlockManager(BlockManagerId(0, u9-r1.mtv, 50205),339585269,Actor[akka://spark/user/BlockManagerActor1])]
spark.SparkException: Error sending message to BlockManagerMaster [message = RegisterBlockManager(BlockManagerId(0, u9-r1.mtv, 50205),339585269,Actor[akka://spark/user/BlockManagerActor1])]
at spark.storage.BlockManagerMaster.askDriverWithReply(BlockManagerMaster.scala:168)
at spark.storage.BlockManagerMaster.tell(BlockManagerMaster.scala:133)
at spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:64)
at spark.storage.BlockManager.initialize(BlockManager.scala:123)
at spark.storage.BlockManager.<init>(BlockManager.scala:108)
at spark.storage.BlockManager.<init>(BlockManager.scala:115)
at spark.SparkEnv$.createFromSystemProperties(SparkEnv.scala:91)
at spark.executor.Executor.initialize(Executor.scala:72)
at spark.executor.StandaloneExecutorBackend$$anonfun$receive$1.apply(StandaloneExecutorBackend.scala:39)
at spark.executor.StandaloneExecutorBackend$$anonfun$receive$1.apply(StandaloneExecutorBackend.scala:36)
at akka.actor.Actor$class.apply(Actor.scala:318)
at spark.executor.StandaloneExecutorBackend.apply(StandaloneExecutorBackend.scala:16)
at akka.actor.ActorCell.invoke(ActorCell.scala:626)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:197)
at akka.dispatch.Mailbox.run(Mailbox.scala:179)
at akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec(AbstractDispatcher.scala:516)
at akka.jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:259)
at akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975)
at akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479)
at akka.jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000] milliseconds
at akka.dispatch.DefaultPromise.ready(Future.scala:870)
at akka.dispatch.DefaultPromise.result(Future.scala:874)
at akka.dispatch.Await$.result(Future.scala:74)
at spark.storage.BlockManagerMaster.askDriverWithReply(BlockManagerMaster.scala:154)
... 19 more
2013-04-02 23:00:14 ERROR [sparkExecutor-akka.actor.default-dispatcher-3]: StandaloneExecutorBackend:47 - Slave registration failed: Duplicate executor ID: 0


I will also try instrumenting as the other poster suggested,

thanks,
Evan

Evan Chan

Apr 2, 2013, 7:24:46 PM
to spark...@googlegroups.com, Krzysztof Wilczynski
Patrick,

So after your suggested change, I was able to get this error when the spark-shell exited:

2013-04-02 23:22:18 ERROR [spark-akka.actor.default-dispatcher-1]: SparkDeploySchedulerBackend:47 - Disconnected from Spark cluster!
2013-04-02 23:22:18 ERROR [spark-akka.actor.default-dispatcher-1]: ClusterScheduler:47 - Exiting due to error from cluster scheduler: Disconnected from Spark cluster
2013-04-02 23:22:18 ERROR [spark-akka.actor.default-dispatcher-4]: ActorSystemImpl:46 - RemoteClientError@akka://spark...@u9-r1.mtv:7077: Error[java.io.InvalidClassException:scala.collection.mutable.WrappedArray$ofRef; local class incompatible: stream classdesc serialVersionUID = 8184381945838716286, local class serialVersionUID = 6238838760334617323
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:562)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1582)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1495)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1731)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
at akka.serialization.JavaSerializer$$anonfun$1.apply(Serializer.scala:121)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at akka.serialization.JavaSerializer.fromBinary(Serializer.scala:121)
at akka.serialization.Serialization.deserialize(Serialization.scala:73)
at akka.remote.MessageSerializer$.deserialize(MessageSerializer.scala:22)
at akka.remote.RemoteMessage.payload(RemoteTransport.scala:212)
at akka.remote.RemoteMarshallingOps$class.receiveMessage(RemoteTransport.scala:283)
at akka.remote.netty.NettyRemoteTransport.receiveMessage(NettyRemoteSupport.scala:46)
at akka.remote.netty.ActiveRemoteClientHandler.messageReceived(Client.scala:281)
at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:95)
at org.jboss.netty.handler.timeout.IdleStateAwareChannelHandler.handleUpstream(IdleStateAwareChannelHandler.java:43)
at org.jboss.netty.channel.StaticChannelPipeline.sendUpstream(StaticChannelPipeline.java:372)
at org.jboss.netty.channel.StaticChannelPipeline$StaticChannelHandlerContext.sendUpstream(StaticChannelPipeline.java:534)
at org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:45)
at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:69)
at org.jboss.netty.handler.execution.OrderedMemoryAwareThreadPoolExecutor$ChildExecutor.run(OrderedMemoryAwareThreadPoolExecutor.java:315)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
]

It's as though there is something messed up in the Scala environment. In this case, though, the remote and local processes are all running on the same host!

-Evan

Patrick Wendell

Apr 2, 2013, 7:31:40 PM
to spark...@googlegroups.com, Krzysztof Wilczynski
Do you have scala installed in multiple locations on that machine? The issue is probably that the worker and driver processes are using either different versions of scala, or versions that were compiled at different times.
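If you suspect two scala-library.jar copies differ, one quick check is to compare the raw bytes of the class named in the InvalidClassException; if the bytes differ, the computed serialVersionUID can differ too. This is a sketch: the helper names and the jar paths in the commented example are hypothetical.

```python
# Compare a single class file across two jars. Identical bytes imply an
# identical serialVersionUID; differing bytes can produce the
# InvalidClassException shown earlier in this thread.
import hashlib
import zipfile

def class_digest(jar_path, class_entry):
    """SHA-1 of one class file inside a jar."""
    with zipfile.ZipFile(jar_path) as jar:
        return hashlib.sha1(jar.read(class_entry)).hexdigest()

def same_class(jar_a, jar_b, class_entry):
    """True if the named class entry is byte-identical in both jars."""
    return class_digest(jar_a, class_entry) == class_digest(jar_b, class_entry)

# Example (hypothetical paths), using the class from the stack trace above:
# same_class("/usr/share/scala/lib/scala-library.jar",
#            "/opt/spark/lib/scala-library.jar",
#            "scala/collection/mutable/WrappedArray$ofRef.class")
```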

- Patrick

Patrick Wendell

Apr 2, 2013, 7:44:23 PM
to spark...@googlegroups.com, Krzysztof Wilczynski
You can actually just look directly at which scala files the process has open to determine what version it's using, and make sure both are using the same scala installation.

[root@ip-10-171-2-252 ~]# jps
2336 SecondaryNameNode
2049 SecondaryNameNode
2886 JobTracker
5313 Jps
1850 NameNode
2179 NameNode
2650 Master
5169 SharkCliDriver

[root@ip-10-171-2-252 ~]# lsof -p 5169 |grep REG |grep scala
...
java    5169 root  156r   REG              202,1  8857794  22214 /root/scala-2.9.2/lib/scala-library.jar
java    5169 root  157r   REG              202,1 11449543  22215 /root/scala-2.9.2/lib/scala-compiler.jar
java    5169 root  158r   REG              202,1   158705  22212 /root/scala-2.9.2/lib/jline.jar

Evan Chan

Apr 3, 2013, 2:40:05 AM
to spark...@googlegroups.com, Krzysztof Wilczynski
Patrick,

When I use lsof to look at the spark master and slave processes, they don't seem to have the scala library open... though I found it through ls -l /proc/<pid>/fd.
However, it seems that both the Spark worker and spark-shell are using the same version of Scala (2.9.2).
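For reference, the /proc approach can be scripted (a Linux-only sketch; the helper name is mine, not part of Spark's tooling):

```python
# List the open .jar files of a process via /proc/<pid>/fd (Linux only).
# Useful when lsof's output doesn't show the jars the way you expect.
import os

def open_jars(pid):
    """Resolved paths of all open file descriptors of `pid` ending in .jar."""
    fd_dir = "/proc/%d/fd" % pid
    jars = []
    for fd in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            continue  # fd closed while we were iterating
        if target.endswith(".jar"):
            jars.append(target)
    return jars
```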

Are you sure you have that -Dspark.akka.logLifecycleEvents option right? I'm not sure I see any extra logging output.

-Evan




--
Evan Chan
Senior Software Engineer | 
e...@ooyala.com | (650) 996-4600
www.ooyala.com | blog | @ooyala

Patrick Wendell

Apr 3, 2013, 2:59:36 AM
to spark...@googlegroups.com, Krzysztof Wilczynski
For the master and worker, you'll need to add the option to:

SPARK_DAEMON_JAVA_OPTS

as well. Then extra akka logs should start appearing in the master log file.
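Concretely, that would be something like the following in spark/conf/spark-env.sh on the master and each worker (a sketch; keep any options you already set in that variable):

```shell
# spark/conf/spark-env.sh on the master and workers
SPARK_DAEMON_JAVA_OPTS+=" -Dspark.akka.logLifecycleEvents=true"
```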

In any case, your problem really does seem to be caused by this serialization version mismatch.

- Patrick

Evan Chan

Apr 3, 2013, 1:26:40 PM
to spark...@googlegroups.com, Krzysztof Wilczynski
Thanks for everyone's help. It turns out the issue was due to DNS problems on our end.

What apparently happens is that the DNS problems cause hostnames to be resolved differently, and so the Akka threads end up binding to different IP addresses.
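A rough way to spot that kind of inconsistency (a sketch, not the exact check Spark or Akka performs; the helper names are mine) is to compare what different forms of a hostname resolve to:

```python
# Compare DNS resolution of two forms of a hostname (e.g. short name vs.
# fully-qualified name). If they resolve to different address sets,
# services can end up binding and advertising different IPs.
import socket

def resolved_addresses(hostname):
    """All IPv4 addresses DNS (or /etc/hosts) returns for `hostname`."""
    return {info[4][0]
            for info in socket.getaddrinfo(hostname, None, socket.AF_INET)}

def resolution_consistent(name_a, name_b):
    """True if both names resolve to the same set of addresses."""
    return resolved_addresses(name_a) == resolved_addresses(name_b)

# Example (hypothetical names): resolution_consistent("u9-r1", "u9-r1.mtv")
```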

-Evan

Andy Konwinski

Apr 3, 2013, 5:04:57 PM
to spark...@googlegroups.com

Really glad to hear you resolved the issue. Thanks for reporting back what the problem was so it's documented for others.

Andy

Evan Chan

Apr 4, 2013, 7:12:56 PM
to spark...@googlegroups.com
There are also a couple worker exceptions, around the same time:

2013-04-04 22:59:37 ERROR [sparkWorker-akka.actor.default-dispatcher-8]: Worker:47 - key not found: app-20130404225937-0007/3
java.util.NoSuchElementException: key not found: app-20130404225937-0007/3
        at scala.collection.MapLike$class.default(MapLike.scala:225)
        at scala.collection.mutable.HashMap.default(HashMap.scala:45)
        at scala.collection.MapLike$class.apply(MapLike.scala:135)
        at scala.collection.mutable.HashMap.apply(HashMap.scala:45)
        at spark.deploy.worker.Worker$$anonfun$receive$1.apply(Worker.scala:126)
        at spark.deploy.worker.Worker$$anonfun$receive$1.apply(Worker.scala:100)
        at akka.actor.Actor$class.apply(Actor.scala:318)
        at spark.deploy.worker.Worker.apply(Worker.scala:18)
        at akka.actor.ActorCell.invoke(ActorCell.scala:626)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:197)
        at akka.dispatch.Mailbox.run(Mailbox.scala:179)
        at akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec(AbstractDispatcher.scala:516)
        at akka.jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:259)
        at akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975)
        at akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479)
        at akka.jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
2013-04-04 22:59:37 INFO  [sparkWorker-akka.actor.default-dispatcher-9]: Worker:31 - Starting Spark worker 10.11.1.19:51057 with 8 cores, 8.0 GB RAM
2013-04-04 22:59:37 INFO  [sparkWorker-akka.actor.default-dispatcher-9]: Worker:31 - Spark home: /usr/lib/spark
2013-04-04 22:59:37 INFO  [sparkWorker-akka.actor.default-dispatcher-9]: Worker:31 - Connecting to master spark://10.11.1.19:7077
2013-04-04 22:59:37 INFO  [sparkWorker-akka.actor.default-dispatcher-10]: IoWorker:55 - IoWorker thread 'spray-io-worker-1' started
2013-04-04 22:59:37 ERROR [sparkWorker-akka.actor.default-dispatcher-9]: Worker:68 - Failed to create web UI
akka.actor.InvalidActorNameException:actor name HttpServer is not unique!
[52eedbe0-9d7b-11e2-a156-003048c63b0c]
        at akka.actor.ActorCell.actorOf(ActorCell.scala:392)
        at akka.actor.LocalActorRefProvider$Guardian$$anonfun$receive$1.liftedTree1$1(ActorRefProvider.scala:394)
        at akka.actor.LocalActorRefProvider$Guardian$$anonfun$receive$1.apply(ActorRefProvider.scala:394)
        at akka.actor.LocalActorRefProvider$Guardian$$anonfun$receive$1.apply(ActorRefProvider.scala:392)
        at akka.actor.Actor$class.apply(Actor.scala:318)
        at akka.actor.LocalActorRefProvider$Guardian.apply(ActorRefProvider.scala:388)

Evan Chan

Apr 5, 2013, 12:52:08 PM
to spark...@googlegroups.com
OK, guys, I think we finally have a handle on the real problem.

It turns out we were including a custom InputFormat in SPARK_CLASSPATH, and that jar was assembled as a fat jar that bundled the Scala library (also 2.9.2). For some reason there was a mismatch between the Scala library bundled in the InputFormat jar and the version installed on the system. So I reassembled the InputFormat jar without scala-library.jar, and now spark-shell starts fine. :-p
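To catch this earlier, you can scan an assembled jar for a bundled Scala library before putting it on SPARK_CLASSPATH (a sketch; the helper names and the example path are hypothetical):

```python
# Detect a bundled Scala library inside an assembled ("fat") jar, the
# situation that caused the mismatch described above.
import zipfile

def bundled_scala_classes(jar_path):
    """Class entries under scala/ inside the jar (empty list if none)."""
    with zipfile.ZipFile(jar_path) as jar:
        return [name for name in jar.namelist()
                if name.startswith("scala/") and name.endswith(".class")]

def bundles_scala(jar_path):
    """True if the jar ships its own copy of the Scala library classes."""
    return bool(bundled_scala_classes(jar_path))

# Example (hypothetical path): bundles_scala("/opt/jobs/my-inputformat-assembly.jar")
```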

thanks,
Evan