Tachyon configuration on a Standalone Cluster with multiple nodes


Nashe Chiu

Jan 25, 2016, 1:12:38 PM
to Tachyon Users
I have the following configuration in my tachyon-env.sh

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export TACHYON_MASTER_ADDRESS=10.20.67.2
export TACHYON_WORKER_MEMORY_SIZE=4GB
export TACHYON_UNDERFS_ADDRESS=/tmp
export TACHYON_RAM_FOLDER=/usr/local/share/tachyonRAMFS

I have also specified the IP addresses of the workers in the conf/workers file.

I'm trying to get Tachyon to work with an Apache Spark 1.6.0 standalone cluster. I haven't linked Spark with Tachyon yet; I wanted to test Tachyon first, but I get the error below when I run
./bin/tachyon format

---------------------------------------------------------------------------------------------------------------------------------------
Connecting to XX.XX.XX.XX as root...
hostname: Name or service not known
Formatting Tachyon Worker @
Exception in thread "main" java.lang.ExceptionInInitializerError
        at tachyon.Constants.<clinit>(Constants.java:328)
        at tachyon.Format.<clinit>(Format.java:32)
Caused by: java.lang.RuntimeException: java.net.UnknownHostException: cluster-01: cluster-01
        at com.google.common.base.Throwables.propagate(Throwables.java:160)
        at tachyon.util.network.NetworkAddressUtils.getLocalIpAddress(NetworkAddressUtils.java:398)
        at tachyon.util.network.NetworkAddressUtils.getLocalHostName(NetworkAddressUtils.java:320)
        at tachyon.conf.TachyonConf.<init>(TachyonConf.java:122)
        at tachyon.conf.TachyonConf.<init>(TachyonConf.java:111)
        at tachyon.Version.<clinit>(Version.java:27)
        ... 2 more
Caused by: java.net.UnknownHostException: cluster-01: cluster-01
        at java.net.InetAddress.getLocalHost(InetAddress.java:1496)
        at tachyon.util.network.NetworkAddressUtils.getLocalIpAddress(NetworkAddressUtils.java:355)
        ... 6 more
Caused by: java.net.UnknownHostException: cluster-01
        at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:922)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1316)
        at java.net.InetAddress.getLocalHost(InetAddress.java:1492)
        ... 7 more
Connection to XX.XX.XX.XX closed.
Formatting Tachyon Master @ 10.20.67.2.
--------------------------------------------------------------------------------------------------------------------------------------------

Could someone please help? I also need detailed documentation or an outline on how to configure Tachyon on a standalone cluster with multiple nodes.

Thanks,

Nashe

Bin Fan

Jan 25, 2016, 8:22:06 PM
to Nashe Chiu, Tachyon Users
Hi Nashe,

The script sshes to each host listed in conf/workers.
In your case, it seems ssh cannot resolve the hostname you put in conf/workers.
Could you verify this by running "ssh your_worker_hostname" manually?
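
One way to run that check against every entry is sketched below. It assumes the workers file uses one host per line with # starting a comment, as in the stock file. The function names list_workers and check_workers are made up for this illustration and are not part of Tachyon.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Tachyon.

# Print the hosts in a workers file, skipping comments and blank lines.
list_workers() {
    sed -e 's/#.*$//' -e 's/[[:space:]]*$//' -e '/^$/d' "$1"
}

# Try a non-interactive ssh to each host and run `hostname` there.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_workers() {
    for host in $(list_workers "$1"); do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" hostname; then
            echo "OK: $host"
        else
            echo "FAILED: $host"
        fi
    done
}
```

Running `check_workers conf/workers` from the Tachyon directory prints one OK or FAILED line per host, which separates an ssh problem from a problem on the remote node itself.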

Best,

- Bin

--
You received this message because you are subscribed to the Google Groups "Tachyon Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tachyon-user...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nashe Chiu

Jan 25, 2016, 10:56:26 PM
to Bin Fan, Tachyon Users
Bin,

Thanks for the prompt response. In my conf/workers file I have

# A Tachyon Worker will be started on each of the machines listed below.
10.1.26.96
----------------------------------------------------

I have enabled passwordless ssh and when I run the command ssh ro...@10.1.26.96 it works fine.

Thanks,

Nashe

cc

Jan 26, 2016, 2:42:20 AM
to Tachyon Users, fanb...@gmail.com
It complains that cluster-01 is an unknown hostname. Can you ping cluster-01 from the node where you run ./bin/tachyon format? Also, what is the hostname of that node?
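
A quick way to test this on each node is sketched below. It assumes a Linux system where getent is available; the function names resolves and check_local_hostname are hypothetical and not part of Tachyon. The idea is that the node's own hostname has to resolve through the system resolver (/etc/hosts, DNS), since the stack trace shows java.net.InetAddress.getLocalHost failing on that lookup.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Tachyon. Assumes getent exists (Linux/glibc).

# Succeed if the given name resolves through the system resolver
# (/etc/hosts, DNS, ...).
resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Report whether this node can resolve its own hostname.
check_local_hostname() {
    name=$(hostname)
    if resolves "$name"; then
        echo "$name resolves"
    else
        echo "$name does not resolve; add it to /etc/hosts or DNS"
        return 1
    fi
}
```

Running `check_local_hostname` on the master and on every worker should show which node cannot look up its own name.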

On Tuesday, January 26, 2016 at 11:56:26 AM UTC+8, Nashe Chiu wrote:

Nashe Chiu

Jan 27, 2016, 3:42:51 PM
to cc, Tachyon Users, Bin Fan
Bin,

The problem was with my /etc/hosts file. I added the names and IP addresses of all the workers and the master, and it is now working fine.
To get Tachyon to work with Apache Spark, is there some configuration under conf/ that I need to specify in either Tachyon or Spark?
I can't seem to find any specific documentation on this. How do I mount or copy files to Tachyon/RAMFS so that I can access them from Spark, e.g. "var file = sc.textFile("tachyon://localhost:19998/MyFile.txt")"?
For Spark to work with Tachyon, in which order do I start the applications, i.e. Tachyon then Spark, or vice versa?

Thanks,

Nashe


Bin Fan

Jan 27, 2016, 11:54:17 PM
to Nashe Chiu, cc, Tachyon Users
Here is the documentation on running Spark on Tachyon:
http://tachyon-project.org/documentation/v0.8.2/Running-Spark-on-Tachyon.html

To copy files to Tachyon, you can use the copyFromLocal command.

You also need to have a Tachyon cluster (master and workers) already configured and running.
You can find instructions here:

- Bin

Nashe Chiu

Jan 28, 2016, 3:09:40 AM
to Bin Fan, cc, Tachyon Users
Bin,

Thanks for pointing out where to get the information. The documentation says: "This guide describes how to run Apache Spark on Tachyon and use HDFS as a running example of Tachyon under storage system".
Its example uses Spark with Tachyon on top of HDFS, not the underlying file system or disk. I'm trying to avoid HDFS and run a standalone Spark cluster on Tachyon RAMFS. So I'm basically looking for where to configure this in either Spark or Tachyon.

Best regards,

Panashe

Bin Fan

Jan 28, 2016, 3:26:49 AM
to Nashe Chiu, cc, Tachyon Users
Hi Nashe,

Most of the instructions in the previous link still apply. You can simply ignore the HDFS-related setup.

It is actually simpler. Basically, you need only a few steps:

(1) Set up and run Tachyon standalone:
Mostly as in the instructions in

(2) Configure Spark with Tachyon:
In this step you may want to check the compatibility matrix
to determine whether you need to recompile Spark to use Tachyon.
In your case, Tachyon 0.8 should work out of the box with Spark 1.6, without recompilation.

(3) Use Tachyon URI as Spark input/output

e.g.
> val s = sc.textFile("tachyon://localhost:19998/foo")
> val double = s.map(line => line + line)
> double.saveAsTextFile("tachyon://localhost:19998/bar")
You may need to load data into Tachyon first, e.g. to tachyon://localhost:19998/foo in this example.
One way to do that is to copy data from your local file system with the copyFromLocal command provided by the Tachyon shell.
You can also download data from Tachyon to your local file system with copyToLocal.
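
The two copy steps might look like the sketch below. It assumes the Tachyon 0.8 shell is invoked as "./bin/tachyon tfs <command>"; check the command-line documentation for your release. TACHYON_HOME, its default path, the wrapper function names, and the example file names are placeholders for illustration.

```shell
#!/bin/sh
# Hypothetical wrappers; TACHYON_HOME is an assumed install location.
TACHYON_HOME=${TACHYON_HOME:-/opt/tachyon}
TACHYON_BIN=${TACHYON_BIN:-$TACHYON_HOME/bin/tachyon}

# Copy a local file into Tachyon, e.g. to_tachyon ./MyFile.txt /foo
to_tachyon() {
    "$TACHYON_BIN" tfs copyFromLocal "$1" "$2"
}

# Copy a file out of Tachyon, e.g. from_tachyon /foo ./foo.copy
from_tachyon() {
    "$TACHYON_BIN" tfs copyToLocal "$1" "$2"
}
```

After `to_tachyon ./MyFile.txt /foo`, the file should be readable from Spark at the /foo path used in the example above. On nodes other than the master, localhost in the URI would presumably need to be replaced with the master address.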

Hope this helps.

Best,

- Bin