Tachyon 0.5.0 - Standalone (Works Fine) vs Cluster (Some Issue)

Naga Vijay

Feb 15, 2016, 7:30:35 PM
to Tachyon Users
Hello,

Unfortunately I am still stuck with Tachyon 0.5.0 on one of my projects.

Tachyon 0.5.0 in standalone mode works fine with this particular version of CDH/Hadoop (Hadoop 2.6.0-cdh5.4.2); the master and worker start up fine on the same box.

> hadoop version
Hadoop 2.6.0-cdh5.4.2
Subversion http://github.com/cloudera/hadoop -r 15b703c8725733b7b2813d2325659eb7d57e7a3f
Compiled by jenkins on 2015-05-20T00:03Z
Compiled with protoc 2.5.0
From source with checksum de74f1adb3744f8ee85d9a5b98f90d
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.4.2.jar

But when I tried to set up Tachyon 0.5.0 as a very simple cluster with just one worker on a different box, the worker failed to start and wrote this error to the worker log:

> cat worke...@10.254.7.192_02-15-2016
2016-02-15 23:59:13,661 INFO  WORKER_LOGGER (MasterClient.java:worker_register) - Registered at the master ip-10-254-7-145.ec2.internal/10.254.7.145:19998 from worker NetAddress(mHost:ip-10-254-7-192.ec2.internal, mPort:29998) , got WorkerId 1455580000001
2016-02-15 23:59:13,935 ERROR  (CommonUtils.java:runtimeException) - Server IPC version 9 cannot communicate with client version 4
org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4
    at org.apache.hadoop.ipc.Client.call(Client.java:1070)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at com.sun.proxy.$Proxy1.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at tachyon.UnderFileSystemHdfs.<init>(UnderFileSystemHdfs.java:89)
    at tachyon.UnderFileSystemHdfs.getClient(UnderFileSystemHdfs.java:56)
    at tachyon.UnderFileSystem.get(UnderFileSystem.java:69)
    at tachyon.UnderFileSystem.get(UnderFileSystem.java:54)
    at tachyon.worker.WorkerStorage.<init>(WorkerStorage.java:333)
    at tachyon.worker.TachyonWorker.<init>(TachyonWorker.java:191)
    at tachyon.worker.TachyonWorker.createWorker(TachyonWorker.java:99)
    at tachyon.worker.TachyonWorker.main(TachyonWorker.java:138)

~~~~~

Any idea what could be happening? Here's the minimal diff in conf, applied on both the master and the slave:

~~~~~

> diff tachyon-env.sh.template tachyon-env.sh
25c25
<     export JAVA_HOME=/usr/lib/jvm/java-7-oracle
---
>     export JAVA_HOME=/usr/java/jdk1.8.0_71
31,32c31,32
< export TACHYON_MASTER_ADDRESS=localhost
< export TACHYON_UNDERFS_ADDRESS=$TACHYON_HOME/underfs
---
> export TACHYON_MASTER_ADDRESS=ip-10-254-7-145.ec2.internal
> export TACHYON_UNDERFS_ADDRESS=hdfs://10.254.7.162:8020
34c34
< export TACHYON_WORKER_MEMORY_SIZE=1GB
---
> export TACHYON_WORKER_MEMORY_SIZE=5GB

~~~~~

Thanks very much for your help in advance.

Regards
Naga Vijayapuram

Jiří Šimša

Feb 15, 2016, 8:24:27 PM
to Naga Vijay, Tachyon Users
Hello Naga,

Tachyon 0.5.0 is very very old. Any chance you could upgrade to Tachyon version 0.8.2 or 0.9-rc?

Best,

Jiří Šimša

Naga Vijay

Feb 15, 2016, 9:58:05 PM
to Tachyon Users, naga...@gmail.com
Hello Jiri,

Unfortunately, this particular project is tied to CDH 5.4.2, which bundles Spark 1.3.0, and according to the compatibility matrix Spark 1.3.0 works with Tachyon 0.5.0. I will look into decoupling Spark from CDH 5.4.2 and moving to Spark 1.6.0, which the same matrix lists as compatible with Tachyon 0.8.2.

Regards
Naga

Gene Pang

Feb 15, 2016, 11:37:01 PM
to Tachyon Users, naga...@gmail.com
Hi Naga,

This type of error (Server IPC version 9 cannot communicate with client version 4) typically occurs when Tachyon was compiled against a different Hadoop version than the one that is deployed. Can you compile Tachyon with this option:

-Dhadoop.version=2.6.0-cdh5.4.2

That said, I'm not entirely sure this works with 0.5, since it is so old.
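
For reference, IPC version 4 is what Hadoop 1.x clients speak and version 9 is what Hadoop 2.x servers expect, so the worker is presumably carrying a Hadoop 1.x client. A rough sketch of how the option could be passed to Maven follows. The helper name tachyon_build_cmd is made up for illustration, -DskipTests is a standard Maven flag added only to shorten the build, and the goals "clean package" are an assumption about how the 0.5.0 source tree is built:

```shell
# Hypothetical helper: prints the Maven command for a given Hadoop version.
# Assumes it will be run from the root of the Tachyon source checkout.
tachyon_build_cmd() {
    # -Dhadoop.version is the option named in this thread;
    # -DskipTests is optional and only skips the test suite.
    printf 'mvn -Dhadoop.version=%s -DskipTests clean package\n' "$1"
}

# Version string taken from the `hadoop version` output earlier in the thread.
tachyon_build_cmd "2.6.0-cdh5.4.2"
```

Running the printed command in the source root should rebuild the jars. They would then need to be redeployed to both the master and the worker box. If Maven cannot resolve the CDH artifacts, a Cloudera Maven repository may need to be configured; that has not been checked against 0.5.0 here.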

Thanks,
Gene

Naga Vijay

Feb 16, 2016, 6:30:42 PM
to Tachyon Users, naga...@gmail.com
Thanks very much, Gene!  Compiling Tachyon 0.5.0 with -Dhadoop.version=2.6.0-cdh5.4.2 resolved the issue.
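
In case it helps anyone else, one way to check a worker log for this mismatch before and after redeploying is sketched below. The function name ipc_mismatch is hypothetical, and the sample line is the one from the log earlier in this thread:

```shell
# Hypothetical helper: reads log text on stdin and prints one
# "server=<n> client=<m>" line per distinct IPC version mismatch found.
ipc_mismatch() {
    sed -n 's/.*Server IPC version \([0-9][0-9]*\) cannot communicate with client version \([0-9][0-9]*\).*/server=\1 client=\2/p' | sort -u
}

# Demo with the error line quoted earlier in the thread.
printf '%s\n' 'org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4' | ipc_mismatch
```

Against a real log the call would be ipc_mismatch < path/to/worker.log (path is a placeholder). Empty output means no such mismatch was logged.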

Regards
Naga

Gene Pang

Feb 16, 2016, 6:52:38 PM
to Tachyon Users, naga...@gmail.com
Hi Naga,

Thanks for the update. Glad I could help!

-Gene