Query regarding using Tachyon 0.8.2 with Spark 1.6


SUBRAMANYESWARA RAO BHAVIRISETTY

Feb 8, 2016, 6:26:17 PM
to Tachyon Users
Hi all,

  I am new to the Tachyon and Spark world.
I am trying to set up tachyon-0.8.2 with spark-1.6 in cluster mode. My requirement is as follows:

1> We have data related to users on our grid (HDFS), stored in JSON-LD format. (For example, there might be an entity, say Brad Pitt, and we store all of Brad Pitt's attributes, like height, marriages, etc., in JSON-LD.)
2> We want to run some queries on that data from spark-shell after integrating with Tachyon. The reason is that we don't want to read from HDFS every time; instead, we want to query the data from Tachyon.
3> For this I am doing the following, as per http://tachyon-project.org/documentation/Running-Spark-on-Tachyon.html:
a) Install spark-1.6 and tachyon-0.8.2, as they are compatible and work together out of the box.
b) Our cluster has hadoop-2.x installed.

Can you please let me know what should be done so that I can be sure I am querying the data from Tachyon instead of HDFS from the spark shell? I am asking because I only see documentation for Hadoop-1.x at this link (http://tachyon-project.org/documentation/Running-Spark-on-Tachyon.html).

Thanks in advance.

~Subramanyam

Gene Pang

Feb 9, 2016, 12:05:16 AM
to Tachyon Users
Hi Subramanyam,

I think it should be relatively simple to deploy Tachyon for your use case. First, you will probably have to recompile Tachyon against the Hadoop version you are running: http://tachyon-project.org/documentation/v0.8.2/Configuring-Tachyon-with-HDFS.html
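
As a rough sketch, assuming your cluster runs Hadoop 2.6.0 (substitute your exact version), rebuilding from the Tachyon source tree would look something like:

  mvn clean install -Dhadoop.version=2.6.0 -DskipTests

The -Dhadoop.version property selects the HDFS client Tachyon is compiled against; the guide linked in the next paragraph lists the exact flags for the various distributions.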

For details on other distributions and versions, you can see the guide here: http://tachyon-project.org/documentation/v0.8.2/Building-Tachyon-Master-Branch.html#distro-support

Spark 1.6 should already be compiled to work with Tachyon 0.8.2. Once you deploy Tachyon (http://tachyon-project.org/documentation/v0.8.2/Running-Tachyon-on-a-Cluster.html), you should be able to access your HDFS data through Tachyon and take advantage of the in-memory storage.
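
For a first deployment, the basic sequence from the Tachyon home directory on the master is roughly the following (SudoMount is needed when the ramdisk has not been mounted yet, as discussed further down this thread):

  ./bin/tachyon format
  ./bin/tachyon-start.sh all SudoMount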

Once you set Tachyon up, your Spark programs only need to change their URIs from HDFS to Tachyon. For example, if your program was reading and writing files at "hdfs://<ip>:<port>/path/files", you should be able to access them through Tachyon at "tachyon://<ip>:<port>/path/files".
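
For instance, a minimal spark-shell session could look like the sketch below. The host name, path, and field names are hypothetical; 19998 is Tachyon's default master port, and sqlContext is the SQLContext that spark-shell creates for you in Spark 1.6:

  // Read the JSON-LD entities from Tachyon instead of HDFS.
  val users = sqlContext.read.json("tachyon://tachyon-master:19998/path/users.jsonld")
  users.registerTempTable("users")
  // Query an entity's attributes, e.g. Brad Pitt's height.
  sqlContext.sql("SELECT height FROM users WHERE name = 'Brad Pitt'").show()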

Hope that helps,
Gene

SUBRAMANYESWARA RAO BHAVIRISETTY

Feb 9, 2016, 4:29:10 PM
to Tachyon Users
Hi Gene,

  Thanks for your reply. I followed the steps you suggested.
I was following the link http://tachyon-project.org/documentation/v0.8.2/Running-Tachyon-on-a-Cluster.html and trying to execute the command "./bin/tachyon-start.sh" (with the right parameters, e.g. "all Mount"). It fails with:

Formatting RamFS: /mnt/ramdisk (1gb)
mkdir: cannot create directory `/mnt/ramdisk': Permission denied
mount: only root can do that
chmod: cannot access `/mnt/ramdisk': No such file or directory
Mount failed, not starting worker

Can you please let me know if we need to run this mount step every time we start Tachyon? For now I am going ahead with the NoMount option.
Thanks in advance.

~Subramanyam

Gene Pang

Feb 9, 2016, 4:33:15 PM
to Tachyon Users
Tachyon workers need to mount the ramdisk, and mounting requires root privileges, so the start script needs to use 'sudo'.

Therefore, you should run: bin/tachyon-start.sh all SudoMount

to start Tachyon.

Thanks,
Gene

Jiří Šimša

Feb 9, 2016, 4:35:15 PM
to Gene Pang, Tachyon Users
Hi Subramanyam,

To add to Gene's answer, you only need to mount the ramdisk once. If it is already mounted, you can start Tachyon with "bin/tachyon-start.sh all NoMount", which will not require root privilege.

Best,

--
Jiří Šimša

SUBRAMANYESWARA RAO BHAVIRISETTY

Feb 9, 2016, 6:23:40 PM
to Tachyon Users
Hi Gene,

  Thanks for the reply. I executed the SudoMount command, and then I started running "./run.sh runTests" from the master machine.

I am getting the following error:

2016-02-09 23:05:24,213 ERROR  (Utils.java:runExample) - Exception running test: tachyon.examples.BasicNonByteBufferOperations@6b09bb57
java.io.IOException: TachyonTException(type:FAILED_TO_CHECKPOINT, message:Failed to rename /tmp/tachyon-0.8.2/underFSStorage/default_tests_files/Basi)
        at tachyon.worker.WorkerClient.persistFile(WorkerClient.java:126)
        at tachyon.client.file.FileOutStream.close(FileOutStream.java:144)
        at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
        at tachyon.examples.BasicNonByteBufferOperations.write(BasicNonByteBufferOperations.java:95)
        at tachyon.examples.BasicNonByteBufferOperations.call(BasicNonByteBufferOperations.java:77)
        at tachyon.examples.BasicNonByteBufferOperations.call(BasicNonByteBufferOperations.java:52)
        at tachyon.examples.Utils.runExample(Utils.java:133)
        at tachyon.examples.BasicNonByteBufferOperations.main(BasicNonByteBufferOperations.java:143)
Caused by: TachyonTException(type:FAILED_TO_CHECKPOINT, message:Failed to rename /tmp/tachyon-0.8.2/underFSStorage/default_tests_files/BasicNonByteBu)
        at tachyon.thrift.WorkerService$persistFile_result$persistFile_resultStandardScheme.read(WorkerService.java:6887)
        at tachyon.thrift.WorkerService$persistFile_result$persistFile_resultStandardScheme.read(WorkerService.java:6873)
        at tachyon.thrift.WorkerService$persistFile_result.read(WorkerService.java:6823)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
        at tachyon.thrift.WorkerService$Client.recv_persistFile(WorkerService.java:322)
        at tachyon.thrift.WorkerService$Client.persistFile(WorkerService.java:307)
        at tachyon.worker.WorkerClient.persistFile(WorkerClient.java:124)
        ... 7 more

I am also seeing the following warnings in the master logs:

2016-02-09 23:21:11,041 WARN  MASTER_LOGGER (FileSystemMaster.java:deleteFileInternal) - File does not exist the underfs: /home/bsrao/tachyon-0.8.2/underFSStorage/default_tests_files/BasicFile_NO_CACHE_CACHE_THROUGH
2016-02-09 23:21:12,071 WARN  MASTER_LOGGER (FileSystemMaster.java:deleteFileInternal) - File does not exist the underfs: /home/bsrao/tachyon-0.8.2/underFSStorage/default_tests_files/BasicNonByteBuffer_NO_CACHE_CACHE_THROUGH

I tried creating files but the tests still fail.

Can you please let me know how I should resolve this issue?

~Subramanyam

Jiří Šimša

Feb 9, 2016, 6:58:01 PM
to SUBRAMANYESWARA RAO BHAVIRISETTY, Tachyon Users
Hi Subramanyam,

Are you using a standalone deployment of Tachyon (a single master and worker running on the same node)? If not, you need to specify a distributed file system (e.g. S3, HDFS, or NFS) as the under file system for Tachyon (take a look at the Under Store section of the docs).
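
For reference, pointing Tachyon at HDFS is a one-line change in conf/tachyon-env.sh (the host name and port below are hypothetical; use your NameNode's address):

  export TACHYON_UNDERFS_ADDRESS=hdfs://namenode-host:9000

After changing the under storage address, re-run ./bin/tachyon format before restarting the servers.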

Best,


SUBRAMANYESWARA RAO BHAVIRISETTY

Feb 9, 2016, 9:52:23 PM
to Tachyon Users, sbhavi...@gmail.com
Hi,

  Thanks for your reply. I have one more question. When we try to access HDFS from Tachyon, I guess we need to set the following parameters for Kerberos (kinit):

tachyon.master.keytab.file
tachyon.master.principal
tachyon.worker.keytab.file
tachyon.worker.principal

Can you please let me know if these are the parameters that need to be set, or if something else needs to be done to make sure Tachyon can access HDFS via kinit? Also, could you please let me know how Tachyon will handle kinit ticket expiry?

~Subramanyam

Jiří Šimša

Feb 9, 2016, 10:05:46 PM
to SUBRAMANYESWARA RAO BHAVIRISETTY, Tachyon Users
What is your configuration? Is your Hadoop cluster using Kerberos for authentication?

SUBRAMANYESWARA RAO BHAVIRISETTY

Feb 10, 2016, 11:29:43 AM
to Tachyon Users, sbhavi...@gmail.com
Yes, it's using Kerberos for authentication.

Jiří Šimša

Feb 10, 2016, 11:34:53 AM
to SUBRAMANYESWARA RAO BHAVIRISETTY, Tachyon Users
Hi Subramanyam, 

If you are using Kerberos, then those are the properties that you should set.
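
For illustration, here is a sketch of what that could look like in conf/tachyon-site.properties; the keytab path and principal names are hypothetical placeholders, so substitute your own:

  tachyon.master.keytab.file=/etc/security/keytabs/tachyon.keytab
  tachyon.master.principal=tachyon/master-host@EXAMPLE.COM
  tachyon.worker.keytab.file=/etc/security/keytabs/tachyon.keytab
  tachyon.worker.principal=tachyon/worker-host@EXAMPLE.COM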

Best,

SUBRAMANYESWARA RAO BHAVIRISETTY

Feb 10, 2016, 2:38:41 PM
to Tachyon Users, sbhavi...@gmail.com
Hi Jiri,

  Thanks a lot for the help in resolving the issues. I have one more question, regarding ramfs.
Yesterday I had to create a separate ramfs partition, and only after that could I start the Tachyon server. The machines I am using have around 500G of memory, so the complete data set could be stored in memory. I guess the data stored in ramfs would be lost if the machine gets rebooted. Can you please let me know if there's a way to restart the Tachyon servers without ramfs?

Also, I see that there's no way to load data from the under file system into memory, as per this ticket: https://tachyon.atlassian.net/browse/TACHYON-494. Can you please let me know if that's still the case?

~Subramanyam

Jiří Šimša

Feb 10, 2016, 4:25:47 PM
to SUBRAMANYESWARA RAO BHAVIRISETTY, Tachyon Users
Hi Subramanyam,

I am not sure I understand your question: "Can you please let me know if there's a way to restart the Tachyon servers without ramfs?"

I think you are asking whether there is a way to prevent losing the data stored in Tachyon if the machine on which a Tachyon worker is running is restarted. There are multiple solutions for this (see the configuration sketch after this list):

1) You could configure Tachyon to use persistent storage instead of RAM. Take a look at the documentation that explains how to configure Tachyon with different types of (tiered) storage.

2) When writing data to Tachyon, you can specify the "write type" of the I/O as one of MUST_CACHE (the default), CACHE_THROUGH, THROUGH, and ASYNC_THROUGH. MUST_CACHE writes data only to Tachyon; CACHE_THROUGH writes data to Tachyon and also synchronously to under storage (e.g. HDFS); THROUGH synchronously writes data only to under storage; and ASYNC_THROUGH writes data to Tachyon and asynchronously propagates it to under storage.
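
To make both options concrete, here is an assumption-laden sketch of the relevant entries in conf/tachyon-site.properties; double-check the exact property names, paths, and quotas against the 0.8.2 tiered storage and configuration docs before relying on them:

  # A memory tier on top of an SSD tier (hypothetical paths and quotas).
  tachyon.worker.tieredstore.level.max=2
  tachyon.worker.tieredstore.level0.alias=MEM
  tachyon.worker.tieredstore.level0.dirs.path=/mnt/ramdisk
  tachyon.worker.tieredstore.level0.dirs.quota=100GB
  tachyon.worker.tieredstore.level1.alias=SSD
  tachyon.worker.tieredstore.level1.dirs.path=/mnt/ssd
  tachyon.worker.tieredstore.level1.dirs.quota=1TB
  # Default write type: write to Tachyon and synchronously to under storage.
  tachyon.user.file.writetype.default=CACHE_THROUGH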

As for your other question, you should be able to load data manually from under storage into Tachyon using "./bin/tachyon loadufs". You could also load only the metadata using "./bin/tachyon tfs loadMetadata" and have Tachyon load the data when needed at runtime.
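
As an illustration only (the argument order here is an assumption; running either command without arguments prints its usage), loading an HDFS directory could look like:

  ./bin/tachyon loadufs tachyon://tachyon-master:19998/data hdfs://namenode-host:9000/data
  ./bin/tachyon tfs loadMetadata /data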

Best,

SUBRAMANYESWARA RAO BHAVIRISETTY

Feb 10, 2016, 7:33:14 PM
to Tachyon Users, sbhavi...@gmail.com
Hi,

  Sorry for the confusion. I was asking how we could make sure that the data in Tachyon is not lost if a worker node gets rebooted.

We want to do the following:

1> The data we want to store in Tachyon lives in HDFS and is around 200G.
2> We would like to store the data in Tachyon (3 cluster nodes, each with 256G of memory, an SSD, and 3TB of hard-disk space) and run queries from spark-shell using the Tachyon URL.

It's good to know that there are command line tools to load Tachyon from HDFS; they would be very handy. Can you please let me know how much time it would take to load 200G of data using the command line tool?

~Subramanyam

Jiří Šimša

Feb 11, 2016, 10:31:28 AM
to SUBRAMANYESWARA RAO BHAVIRISETTY, Tachyon Users
Hi Subramanyam,

Your best bet is to use either CACHE_THROUGH or ASYNC_THROUGH. The CACHE_THROUGH mode guarantees that no data is lost when a Tachyon worker is rebooted: the data is also written to the under storage, and when the Tachyon master detects that the data is no longer available in the Tachyon worker (because its machine was rebooted), Tachyon will fetch it from under storage. The ASYNC_THROUGH mode provides a weaker guarantee, as it aims to strike a balance between write performance and durability: the write happens at memory speed, and the data is written to under storage asynchronously. If the Tachyon worker machine is rebooted in the window after the data is written to Tachyon but before it is asynchronously written out to under storage, that data will be lost.
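
If you go with CACHE_THROUGH, one way to select it for your Spark jobs, assuming the Tachyon client picks up JVM system properties (the property name matches the write-type option from my earlier message; treat the exact mechanism as something to verify), is:

  spark-shell \
    --conf "spark.driver.extraJavaOptions=-Dtachyon.user.file.writetype.default=CACHE_THROUGH" \
    --conf "spark.executor.extraJavaOptions=-Dtachyon.user.file.writetype.default=CACHE_THROUGH"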

How much time it would take to load 200G of data depends on several factors; the most important will likely be the network bandwidth between HDFS and Tachyon, or the disk bandwidth on the HDFS machines, as these are likely to be the bottleneck. If the network bandwidth available for the transfer is 1Gbps Ethernet, then a lower bound for the transfer is: 200 GB is about 1,600 Gb, and 1,600 Gb / 1 Gbps ~ 1,600 seconds, or roughly 27 minutes. The same math applies to disk bandwidth (assuming a sequential transfer). I would suggest you try moving a smaller amount of data first to estimate how long the 200G data set transfer will take.

Best,