spark cannot connect to tachyon master


Puja Gupta

May 4, 2015, 5:13:39 PM
to tachyo...@googlegroups.com
Hi,

I am getting the following error in spark-shell. I tried using the actual IP of the master instead of the hostname, but I still get the same error. The Tachyon master and worker are running properly; I verified this from the logs. Any help is appreciated.

15/05/04 17:07:52 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, PROCESS_LOCAL, 1397 bytes)
15/05/04 17:07:52 INFO executor.Executor: Running task 0.0 in stage 0.0 (TID 0)
15/05/04 17:07:52 INFO executor.Executor: Running task 1.0 in stage 0.0 (TID 1)
15/05/04 17:07:52 INFO executor.Executor: Fetching http://192.168.140.254:41747/jars/spark-examples-1.3.0-hadoop2.3.0.jar with timestamp 1430773672121
15/05/04 17:07:52 INFO util.Utils: Fetching http://192.168.140.254:41747/jars/spark-examples-1.3.0-hadoop2.3.0.jar to /tmp/spark-cee960f3-c4ee-46d7-9211-1911f7d4bf59/userFiles-2e9a6489-5f8f-4479-8898-d67430ab1b2b/fetchFileTemp3897120192696288762.tmp
15/05/04 17:07:53 INFO executor.Executor: Adding file:/tmp/spark-cee960f3-c4ee-46d7-9211-1911f7d4bf59/userFiles-2e9a6489-5f8f-4479-8898-d67430ab1b2b/spark-examples-1.3.0-hadoop2.3.0.jar to class loader
15/05/04 17:07:53 INFO spark.CacheManager: Partition rdd_0_1 not found, computing it
15/05/04 17:07:53 INFO spark.CacheManager: Partition rdd_0_0 not found, computing it
15/05/04 17:07:53 INFO : Trying to connect master @ /192.168.140.254:19998
15/05/04 17:07:53 ERROR : Failed to connect (1) to master ri/192.168.140.254:19998 : java.net.ConnectException: Connection refused
15/05/04 17:07:54 ERROR : Failed to connect (2) to master ri/192.168.140.254:19998 : java.net.ConnectException: Connection refused
15/05/04 17:07:55 ERROR : Failed to connect (3) to master ri/192.168.140.254:19998 : java.net.ConnectException: Connection refused
15/05/04 17:07:56 ERROR : Failed to connect (4) to master ri/192.168.140.254:19998 : java.net.ConnectException: Connection refused
15/05/04 17:07:57 ERROR : Failed to connect (5) to master ri/192.168.140.254:19998 : java.net.ConnectException: Connection refused
15/05/04 17:07:58 WARN storage.TachyonBlockManager: Attempt 1 to create tachyon dir null failed
java.io.IOException: Failed to connect to master ri/192.168.140.254:19998 after 5 attempts
        at tachyon.client.TachyonFS.connect(TachyonFS.java:293)
        at tachyon.client.TachyonFS.getFileId(TachyonFS.java:1011)
        at tachyon.client.TachyonFS.exist(TachyonFS.java:633)
        at org.apache.spark.storage.TachyonBlockManager$$anonfun$createTachyonDirs$2.apply(TachyonBlockManager.scala:117)
        at org.apache.spark.storage.TachyonBlockManager$$anonfun$createTachyonDirs$2.apply(TachyonBlockManager.scala:106)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
 
Thanks,
Puja
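
A "Connection refused" like the one in the log above can be narrowed down by testing the TCP port directly from the machine that runs the Spark executor. The following is only a minimal sketch using plain JDK sockets; the helper name canConnect and the 2-second default timeout are assumptions for illustration and are not part of Tachyon or Spark.

```scala
import java.net.{InetSocketAddress, Socket}

// Hypothetical helper: returns true if a TCP connection to host:port
// succeeds within timeoutMs milliseconds, false on any I/O failure
// (connection refused, timeout, unknown host).
def canConnect(host: String, port: Int, timeoutMs: Int = 2000): Boolean = {
  val socket = new Socket()
  try {
    socket.connect(new InetSocketAddress(host, port), timeoutMs)
    true
  } catch {
    case _: java.io.IOException => false
  } finally {
    socket.close()
  }
}
```

Pasted into spark-shell, canConnect("192.168.140.254", 19998) would probe the master address shown in the log. A false result suggests that nothing is listening on that address and port, or that something on the network path is blocking it. It does not say which of the two.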

Jonathan Coveney

May 4, 2015, 6:43:48 PM
to tachyo...@googlegroups.com
are the master and worker local, or remote?

Puja Gupta

May 4, 2015, 6:48:09 PM
to tachyo...@googlegroups.com
They are local. When starting Tachyon I used "tachyon-start.sh local Mount" instead of SudoMount. I don't have sudo permissions because it is a college cluster, but I guess that should be fine.

Calvin Jia

May 5, 2015, 4:42:01 PM
to tachyo...@googlegroups.com
Hi Puja,

Are you still having this issue after resolving the network issues in the other thread?

Thanks,
Calvin

Puja Gupta

May 5, 2015, 4:50:17 PM
to tachyo...@googlegroups.com
Hi Calvin,
No, it works fine for me now :) Thank you.

Best,
Puja

薛晨浩

Oct 16, 2015, 5:25:05 AM
to Tachyon Users
I used the command "sudo bin/tachyon-start.sh all Mount", but Spark still cannot connect to the Tachyon master. Can you give me some advice?


Jiří Šimša

Oct 16, 2015, 9:53:03 AM
to 薛晨浩, Tachyon Users
What Spark command are you running and what error are you seeing?

--
Jiří Šimša


薛晨浩

Oct 16, 2015, 10:09:34 AM
to Tachyon Users, hear...@gmail.com
I am running the program through spark-shell (cluster mode). The commands are:
val s = sc.textFile("tachyon://master:19998:/test/README.md")
s.count()
In addition, I have to use "sudo tachyon-start.sh all Mount" to start Tachyon; otherwise it logs "permission denied".
The error is "java.io.IOException: Failed to connect to master xxx.xxx.xxx.xxx:19998 after 29 attempts"
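
A side note on the URI in the command above: as posted, it has a colon after the port ("19998:/test"). Whether that was in the command actually run or is only a typo in the post is not known, and the thread does not identify it as the cause. Still, the shape of a URI can be sanity-checked before handing it to sc.textFile. The sketch below uses only java.net.URI; the helper name masterAddress is hypothetical.

```scala
import java.net.URI
import scala.util.Try

// Hypothetical helper: extracts (host, port) from a URI string, or returns
// None when the authority part cannot be read as host:port.
def masterAddress(uri: String): Option[(String, Int)] =
  Try(new URI(uri)).toOption
    .filter(u => u.getHost != null && u.getPort != -1)
    .map(u => (u.getHost, u.getPort))
```

For the conventional form "tachyon://master:19998/test/README.md" this yields Some((master,19998)). For the form with the extra colon it should yield None, because the authority "master:19998:" is not a valid host:port pair.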


Yupeng Fu

Oct 16, 2015, 1:48:05 PM
to 薛晨浩, Tachyon Users
Hi,

It seems your master was not up. Can you check logs/master.log, and paste the log?

Cheers,
--
--Yupeng

薛晨浩

Oct 16, 2015, 10:43:49 PM
to Tachyon Users, hear...@gmail.com
I think my Tachyon master was up, because the web UI looks fine. The log is:
2015-10-17 10:38:32,456 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run) - Thrift error occurred during processing of message.
org.apache.thrift.TException: Service name not found in message name: user_getUserId.  Did you forget to use a TMultiplexProtocol in your client?
at org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:103)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2015-10-17 10:38:32,712 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run) - Thrift error occurred during processing of message.
org.apache.thrift.TException: Service name not found in message name: user_getUserId.  Did you forget to use a TMultiplexProtocol in your client?
at org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:103)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2015-10-17 10:38:32,760 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run) - Thrift error occurred during processing of message.
org.apache.thrift.TException: Service name not found in message name: user_getUserId.  Did you forget to use a TMultiplexProtocol in your client?
at org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:103)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2015-10-17 10:38:33,264 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run) - Thrift error occurred during processing of message.
org.apache.thrift.TException: Service name not found in message name: user_getUserId.  Did you forget to use a TMultiplexProtocol in your client?
at org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:103)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2015-10-17 10:38:33,457 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run) - Thrift error occurred during processing of message.
org.apache.thrift.TException: Service name not found in message name: user_getUserId.  Did you forget to use a TMultiplexProtocol in your client?
at org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:103)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2015-10-17 10:38:33,713 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run) - Thrift error occurred during processing of message.
org.apache.thrift.TException: Service name not found in message name: user_getUserId.  Did you forget to use a TMultiplexProtocol in your client?
at org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:103)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2015-10-17 10:38:33,761 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run) - Thrift error occurred during processing of message.
org.apache.thrift.TException: Service name not found in message name: user_getUserId.  Did you forget to use a TMultiplexProtocol in your client?
at org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:103)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)



Gene Pang

Oct 16, 2015, 11:04:10 PM
to Tachyon Users, hear...@gmail.com
It looks like the master and the client are running different versions of Tachyon. The master is running the newer code, whereas your application is using an older version of the client, which does not use TMultiplexedProtocol yet. I think you might have to recompile Spark against the newer version of Tachyon.

Thanks,
Gene
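
To make the diagnosis above concrete: a multiplexed Thrift server expects every message name to carry a service prefix of the form "ServiceName:methodName", while a non-multiplexed client sends the bare method name, such as user_getUserId in the master log. The following is a simplified model of that lookup in plain Scala, not Thrift's actual code; the function name route and the service name "MasterService" are made up for illustration.

```scala
// Simplified model of a multiplexed dispatcher. Message names must have the
// form "ServiceName:methodName", and the service must be registered.
val Separator = ":"

def route(messageName: String, services: Set[String]): Either[String, (String, String)] = {
  val idx = messageName.indexOf(Separator)
  if (idx < 0) {
    // No separator: the client did not prefix the method with a service name.
    Left(s"Service name not found in message name: $messageName")
  } else {
    val service = messageName.substring(0, idx)
    val method = messageName.substring(idx + Separator.length)
    if (services.contains(service)) Right((service, method))
    else Left(s"Service name not found: $service")
  }
}
```

With this model, route("user_getUserId", Set("MasterService")) returns a Left carrying the same "Service name not found in message name" text seen in the master log, while route("MasterService:user_getUserId", Set("MasterService")) returns a Right. That is why an older, non-multiplexed client fails against a newer master even though the master itself is up.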

薛晨浩

Oct 16, 2015, 11:48:03 PM
to Tachyon Users, hear...@gmail.com

My Tachyon version is 0.7.1 and my Spark version is 1.5.1. The Tachyon documentation says this version pairing works together out of the box. Do I still need to recompile Spark?


Gene Pang

Oct 16, 2015, 11:52:56 PM
to Tachyon Users, hear...@gmail.com
It should work with 0.7.1, but I think Tachyon started using TMultiplexedProcessor after version 0.7.1? What does your master web UI say the Tachyon version is?

Thanks,
Gene
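
On the client side, one rough way to see which jar supplies the Tachyon client inside spark-shell is to ask the JVM where a client class was loaded from. This is only a sketch under assumptions: the helper name jarOf is hypothetical, and the class name tachyon.client.TachyonFS is taken from the stack trace earlier in the thread. The result is a jar location, not a version number, so it only helps if the jar name or path reveals the version.

```scala
import scala.util.Try

// Hypothetical helper: returns the location (usually a jar URL) from which
// the named class was loaded, or None if the class is not on the classpath
// or has no code source (as with JDK bootstrap classes).
def jarOf(className: String): Option[String] =
  Try(Class.forName(className)).toOption
    .flatMap(c => Option(c.getProtectionDomain.getCodeSource))
    .flatMap(cs => Option(cs.getLocation))
    .map(_.toString)
```

In spark-shell, jarOf("tachyon.client.TachyonFS") would show whether the client comes from the Spark assembly or from a separately supplied jar, which can then be compared with the version reported by the master web UI.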

Takechiyo

Oct 17, 2015, 11:04:24 PM
to Tachyon Users, hear...@gmail.com
Thanks a lot. I solved my problem. I had misremembered which version I was running. I am now using Tachyon version 0.7.1, which I recompiled, and it works!



Gene Pang

Oct 22, 2015, 10:33:14 AM
to Tachyon Users, hear...@gmail.com
Thanks for verifying!

-Gene