Hi all,
I am new to the Tachyon and Spark world.
I am trying to set up tachyon-0.8.2 with spark-1.6 in cluster mode. My requirements are as follows:
1> We have user-related data on the grid (HDFS) in JSON-LD format. (For example, there is an entity such as Brad Pitt, and we store all of its attributes, such as height, marriages, etc., in JSON-LD.)
2> We want to run some queries on that data from spark-shell after integrating with Tachyon. The reason is that we don't want to read from HDFS every time; instead we want to query the data from Tachyon.
3> For this I am doing the following, as per this link:
http://tachyon-project.org/documentation/Running-Spark-on-Tachyon.html
a) Install spark-1.6 and tachyon-0.8.2, since the two are compatible and work together out of the box.
b) Our cluster has Hadoop 2.x installed.
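To make the goal in 2> concrete, this is roughly the kind of spark-shell session I have in mind. It is only a sketch under assumptions: the Tachyon master hostname, the port, the path, and the field names are placeholders for illustration, and it assumes the Tachyon client is already on Spark's classpath and the data has already been loaded into Tachyon.

```scala
// Run inside spark-shell (Spark 1.6), where sc and sqlContext are predefined.
// Hypothetical Tachyon master address and path; replace with real values.
val entitiesPath = "tachyon://tachyon-master:19998/data/entities"

// Read the JSON-LD files as ordinary JSON (assumes one JSON object per line).
val entities = sqlContext.read.json(entitiesPath)
entities.registerTempTable("entities")

// Example query; the "name" and "height" fields are assumed for illustration.
sqlContext.sql("SELECT name, height FROM entities WHERE name = 'Brad Pitt'").show()
```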
Can you please let me know what needs to be done so that I can query the data from Tachyon instead of HDFS from the spark-shell? I am asking because I only see documentation for Hadoop 1.x at this link (http://tachyon-project.org/documentation/Running-Spark-on-Tachyon.html).
Thanks in advance.
~Subramanyam