Thanks for replying! Unless I am mistaken (which is entirely possible), there is no way to say "no matter what, always use Tachyon", right?
WordCount and Grep take input/output path arguments, so it is easy to point them at Tachyon, but TestDFSIO does not and will fall back to the defaultFS defined in core-site.xml.
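For reference, this is roughly how the example jobs can be pointed at Tachyon directly; the master hostname, port, and paths below are placeholders for my setup, and the exact examples-jar name depends on the Hadoop version:

```shell
# Assumed: a running Tachyon master at master:19998 and the Hadoop
# examples jar on this node. WordCount accepts explicit input/output
# URIs, so the tachyon:// scheme can be passed directly:
hadoop jar hadoop-mapreduce-examples-*.jar wordcount \
  tachyon://master:19998/wordcount/input \
  tachyon://master:19998/wordcount/output
```

TestDFSIO offers no such arguments, which is why it only works through the defaultFS.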
Also, the TestDFSIO bundled as a unit test didn't do what I wanted, since it wouldn't use my already-deployed Tachyon cluster, right? (Again, I could be mistaken.)
So, setting Tachyon as the defaultFS and then trying to run TestDFSIO produced the following:
15/03/22 13:33:58 DEBUG security.UserGroupInformation: PrivilegedActionException as:kerkinos (auth:SIMPLE) cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: tachyon
15/03/22 13:33:58 INFO mapreduce.Cluster: Failed to use org.apache.hadoop.mapred.YarnClientProtocolProvider due to error: Error in instantiating YarnClient
15/03/22 13:33:58 DEBUG mapreduce.Cluster: Trying ClientProtocolProvider : org.apache.hadoop.mapred.LocalClientProtocolProvider
15/03/22 13:33:58 DEBUG mapreduce.Cluster: Cannot pick org.apache.hadoop.mapred.LocalClientProtocolProvider as the ClientProtocolProvider - returned null protocol
java.io.IOException: Cannot initialize Cluster. Please check your configuration for
mapreduce.framework.name and the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
at org.apache.hadoop.mapred.JobClient.init(JobClient.java:470)
at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:449)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:832)
at org.apache.hadoop.fs.TestDFSIO.runIOTest(TestDFSIO.java:443)
at org.apache.hadoop.fs.TestDFSIO.writeTest(TestDFSIO.java:425)
at org.apache.hadoop.fs.TestDFSIO.run(TestDFSIO.java:755)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.TestDFSIO.main(TestDFSIO.java:650)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:118)
at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:126)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
15/03/22 13:33:58 DEBUG : Disconnecting from the master localhost/127.0.0.1:19998
15/03/22 13:33:58 DEBUG ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@519d5d83
So my problem here was YARN: changing the value of the following property (the one the error message points at) from yarn to local made the test run with no errors.
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
I then looked a bit through the Tachyon code and at similar issues reported against other filesystems, and found that adding the following property to core-site.xml should work:
<property>
<name>fs.AbstractFileSystem.tachyon.impl</name>
<value>tachyon.hadoop.AbstractTFS</value>
</property>
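Putting it together, the relevant part of my core-site.xml ends up looking roughly like this. This is only a sketch: the master host and port are placeholders for your deployment, and fs.tachyon.impl is the older FileSystem-API binding that the Tachyon docs also ask for, alongside the AbstractFileSystem one that fixed the YARN error above:

```xml
<?xml version="1.0"?>
<!-- Sketch of the relevant core-site.xml entries.
     Host/port are placeholders for your deployment. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>tachyon://master:19998</value>
  </property>
  <!-- FileSystem (classic API) binding, from the Tachyon docs -->
  <property>
    <name>fs.tachyon.impl</name>
    <value>tachyon.hadoop.TFS</value>
  </property>
  <!-- AbstractFileSystem (FileContext/YARN) binding, the fix above -->
  <property>
    <name>fs.AbstractFileSystem.tachyon.impl</name>
    <value>tachyon.hadoop.AbstractTFS</value>
  </property>
</configuration>
```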