java.io.IOException: No FileSystem for scheme: tachyon


Hobin Yoon

May 31, 2013, 5:48:02 PM
to tachyo...@googlegroups.com
Hi Haoyuan, could you help me with this error? I am following this: https://github.com/amplab/tachyon/wiki/Running-Tachyon-Locally

root@ts80:~/work/hadoop-1.1.2# bin/hadoop fs -ls /user/root/input/capacity-scheduler.xml
Found 1 items
-rw-r--r--   1 root supergroup       7457 2013-05-31 17:10 /user/root/input/capacity-scheduler.xml

root@ts80:~/work/spark-0.7.0# ./spark-shell 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/work/tachyon/target/tachyon-0.2.1-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/work/spark-0.7.0/lib_managed/jars/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
Welcome to
      ____              __  
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 0.7.0
      /_/                  

Using Scala version 2.9.3 (OpenJDK 64-Bit Server VM, Java 1.7.0_21)
Initializing interpreter...
13/05/31 17:20:56 INFO server.Server: jetty-7.x.y-SNAPSHOT
13/05/31 17:20:56 INFO server.AbstractConnector: Started SocketC...@0.0.0.0:40759 STARTING
Creating SparkContext...
13/05/31 17:21:03 INFO slf4j.Slf4jEventHandler: Slf4jEventHandler started
13/05/31 17:21:03 INFO storage.BlockManagerMaster: Registered BlockManagerMaster Actor
13/05/31 17:21:03 INFO storage.MemoryStore: MemoryStore started with capacity 323.9 MB.
13/05/31 17:21:03 INFO storage.DiskStore: Created local directory at /tmp/spark-local-20130531172103-6abf
13/05/31 17:21:03 INFO network.ConnectionManager: Bound socket to port 48098 with id = ConnectionManagerId(ts80,48098)
13/05/31 17:21:03 INFO storage.BlockManagerMaster: Trying to register BlockManager
13/05/31 17:21:03 INFO storage.BlockManagerMasterActor$BlockManagerInfo: Registering block manager ts80:48098 with 323.9 MB RAM
13/05/31 17:21:03 INFO storage.BlockManagerMaster: Registered BlockManager
13/05/31 17:21:03 INFO server.Server: jetty-7.x.y-SNAPSHOT
13/05/31 17:21:03 INFO server.AbstractConnector: Started SocketC...@0.0.0.0:35020 STARTING
13/05/31 17:21:03 INFO broadcast.HttpBroadcast: Broadcast server started at http://130.207.110.117:35020
13/05/31 17:21:03 INFO spark.MapOutputTracker: Registered MapOutputTrackerActor actor
13/05/31 17:21:03 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-078c4401-23fb-4372-873e-92090b027690
13/05/31 17:21:03 INFO server.Server: jetty-7.x.y-SNAPSHOT
13/05/31 17:21:03 INFO server.AbstractConnector: Started SocketC...@0.0.0.0:57010 STARTING
13/05/31 17:21:03 INFO io.IoWorker: IoWorker thread 'spray-io-worker-0' started
13/05/31 17:21:04 INFO server.HttpServer: akka://spark/user/BlockManagerHTTPServer started on /0.0.0.0:41214
13/05/31 17:21:04 INFO storage.BlockManagerUI: Started BlockManager web UI at http://ts80:41214
Spark context available as sc.
Type in expressions to have them evaluated.
Type :help for more information.

scala> val s = sc.textFile("tachyon://localhost:19998/user/root/input/capacity-scheduler.xml")
13/05/31 17:30:44 INFO storage.MemoryStore: ensureFreeSpace(56931) called with curMem=0, maxMem=339585269
13/05/31 17:30:44 INFO storage.MemoryStore: Block broadcast_0 stored as values to memory (estimated size 55.6 KB, free 323.8 MB)
s: spark.RDD[String] = MappedRDD[1] at textFile at <console>:12

scala> s.count()
13/05/31 17:31:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/05/31 17:31:12 WARN snappy.LoadSnappy: Snappy native library not loaded
java.io.IOException: No FileSystem for scheme: tachyon
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1408)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1429)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:176)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)
at spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:52)
at spark.RDD.partitions(RDD.scala:168)
at spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:9)
at spark.RDD.partitions(RDD.scala:168)
at spark.SparkContext.runJob(SparkContext.scala:624)
at spark.RDD.count(RDD.scala:490)
at <init>(<console>:15)
at <init>(<console>:20)
at <init>(<console>:22)
at <init>(<console>:24)
at <init>(<console>:26)
at .<init>(<console>:30)
at .<clinit>(<console>)
at .<init>(<console>:11)
at .<clinit>(<console>)
at $export(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:629)
at spark.repl.SparkIMain$Request$$anonfun$10.apply(SparkIMain.scala:890)
at scala.tools.nsc.interpreter.Line$$anonfun$1.apply$mcV$sp(Line.scala:43)
at scala.tools.nsc.io.package$$anon$2.run(package.scala:25)
at java.lang.Thread.run(Thread.java:722)

Hobin

Hobin Yoon

May 31, 2013, 5:59:30 PM
to tachyo...@googlegroups.com
I do see the files created by bin/run-all-tests.sh, though.

root@ts80:~/work/tachyon# bin/tachyon tfs ls /
80.00 B   05-31-2013 16:55:51:690  In Memory      /Basic_File_WRITE_CACHE
0.00 B    05-31-2013 16:55:52:263  Not In Memory  /Basic_Raw_Table_WRITE_CACHE
80.00 B   05-31-2013 16:55:52:864  In Memory      /Basic_File_WRITE_CACHE_THROUGH
0.00 B    05-31-2013 16:55:53:440  Not In Memory  /Basic_Raw_Table_WRITE_CACHE_THROUGH
80.00 B   05-31-2013 16:55:54:050  In Memory      /Basic_File_WRITE_THROUGH
0.00 B    05-31-2013 16:55:54:638  Not In Memory  /Basic_Raw_Table_WRITE_THROUGH

Hobin

Haoyuan Li

May 31, 2013, 6:14:56 PM
to Hobin Yoon, tachyo...@googlegroups.com
Can you paste your "spark/conf/core-site.xml"?



Hobin Yoon

May 31, 2013, 6:50:10 PM
to Haoyuan Li, tachyo...@googlegroups.com
I don't have that file. Since there was no spark/conf/core-site.xml, I assumed it was a typo and edited hadoop/conf/core-site.xml instead.

root@ts80:~/work/hadoop-1.1.2# cat conf/core-site.xml 
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>

<property>
<name>fs.tachyon.impl</name>
<value>tachyon.hadoop.TachyonFileSystem</value>
</property>
</configuration>

Should I have created spark/conf/core-site.xml instead?

Hobin

Haoyuan Li

Jun 1, 2013, 2:14:03 AM
to Hobin Yoon, tachyo...@googlegroups.com
Yes. You can try:

cp hadoop-1.1.2/conf/*.xml tachyon/conf/.

Hobin Yoon

Jun 1, 2013, 2:44:23 PM
to Haoyuan Li, tachyo...@googlegroups.com
I copied the XML files and started Tachyon with bin/start-local.sh.

root@ts80:~/work/tachyon# bin/start-local.sh 
Killed 0 processes
Killed 1 processes
localhost: Killed 0 processes
Mounting ramfs on Linux...
TACHYON_RAM_FOLDER was not set. Using the default one: /mnt/ramdisk
Formatting RamFS: /mnt/ramdisk
Starting master @ localhost
Starting worker @ ts80

But the Tachyon master still does not start.

root@ts80:~/work/tachyon# jps -l | grep tachyon
26243 tachyon.Worker

The master log says it cannot connect to port 54310.

root@ts80:~/work/tachyon/logs# cat maste...@130.207.110.117_06-01-2013
...
2013-06-01 14:30:09,978 INFO  ipc.Client (Client.java:handleConnectionFailure) - Retrying connect to server: localhost/127.0.0.1:54310. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-06-01 14:30:09,981 ERROR  (CommonUtils.java:runtimeException) - Call to localhost/127.0.0.1:54310 failed on connection exception: java.net.ConnectException: Connection refused
java.net.ConnectException: Call to localhost/127.0.0.1:54310 failed on connection exception: java.net.ConnectException: Connection refused
...

root@ts80:~/work/tachyon# cat conf/tachyon-env.sh
...
export TACHYON_HDFS_ADDRESS=hdfs://localhost:54310
...

Hobin

Haoyuan Li

Jun 1, 2013, 2:47:08 PM
to Hobin Yoon, tachyo...@googlegroups.com
Is HDFS running? Is its version the same as the one you compiled Tachyon against? Is this a pre-built Tachyon? If not, what is the Hadoop version in pom.xml?

Hobin Yoon

Jun 1, 2013, 2:55:48 PM
to Haoyuan Li, tachyo...@googlegroups.com
HDFS is running.

root@ts80:~/work/tachyon# jps -l | sort -k 2
28108 org.apache.hadoop.hdfs.server.datanode.DataNode
27845 org.apache.hadoop.hdfs.server.namenode.NameNode
28373 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
28474 org.apache.hadoop.mapred.JobTracker
28739 org.apache.hadoop.mapred.TaskTracker
27381 sun.tools.jps.Jps
27250 tachyon.Worker

The Hadoop version is 1.1.2, and yes, it is the same version I built Tachyon against.

root@ts80:~/work/tachyon# cat pom.xml
...
    <profile>
      <id>hadoop1</id>
      <activation>
        <activeByDefault>true</activeByDefault>
      </activation>
      <properties>
        <classifier>hadoop1</classifier>
        <hadoop.major.version>1</hadoop.major.version>
        <hadoop.version>1.1.2</hadoop.version>
      </properties>
      <dependencies>
        <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-core</artifactId>
          <version>${hadoop.version}</version>
          <optional>true</optional>
        </dependency>
      </dependencies>
    </profile>
...

Tachyon was built from the source code. I cloned the git repository and checked out version 0.2.1.

Hobin Yoon

Jun 1, 2013, 8:22:44 PM
to Haoyuan Li, tachyo...@googlegroups.com
So it was the HDFS port number. My NameNode listens on port 9000 (as configured in core-site.xml), while tachyon-env.sh pointed to 54310. It is working after specifying the correct port in tachyon-env.sh!

root@ts80:~/work/tachyon/conf# cat tachyon-env.sh
...
export TACHYON_HDFS_ADDRESS=hdfs://localhost:9000
...

root@ts80:~/work/tachyon# bin/format.sh 
Formatting Tachyon @ localhost
13/06/01 20:06:54 INFO : Deleting hdfs://localhost:9000/tachyon/tachyon_checkpoint.data
13/06/01 20:06:54 INFO : Deleting hdfs://localhost:9000/tachyon/tachyon_log.data
13/06/01 20:06:54 INFO : Formatting hdfs://localhost:9000/tachyon/data
13/06/01 20:06:54 INFO : Formatting hdfs://localhost:9000/tachyon/workers

root@ts80:~/work/tachyon# bin/start-local.sh 
Killed 0 processes
Killed 1 processes
localhost: Killed 0 processes
Mounting ramfs on Linux...
TACHYON_RAM_FOLDER was not set. Using the default one: /mnt/ramdisk
Formatting RamFS: /mnt/ramdisk
Starting master @ localhost
Starting worker @ ts80

root@ts80:~/work/spark-0.7.0# ./spark-shell
...
scala> val s = sc.textFile("tachyon://localhost:19998/user/root/input/capacity-scheduler.xml")
scala> s.count()
...
res1: Long = 178

Hobin

Aakanksha

Aug 31, 2015, 5:30:55 PM
to Tachyon Users, haoyu...@gmail.com
Hello Haoyuan and Hobin,

I am facing the same issue when I try to run MapReduce with Tachyon. My Hadoop version is 2.7.1 and my Tachyon version is 0.7.0. I recompiled Tachyon against Hadoop 2.7.1 and also followed the steps to add the Tachyon dependencies to the Hadoop classpath. When I run the wordcount example mentioned on the page, I hit this error:

java.io.IOException: No FileSystem for scheme: tachyon
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2644)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2651)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.addInputPath(FileInputFormat.java:520)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:83)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

I checked, and the HDFS port in tachyon-env.sh is correct. This is a single-node cluster. Could you point out other possible sources of this error?

Thanks,
Aakanksha

Haoyuan Li

Sep 1, 2015, 1:57:02 AM
to Aakanksha, Tachyon Users
Aakanksha,

You need to set your Spark or MapReduce application's configuration so it understands the `tachyon://` URI scheme.

e.g. something like this in Spark:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
val sc = new SparkContext(conf)
// Tell the Hadoop layer which FileSystem class handles the tachyon:// scheme
sc.hadoopConfiguration.set("fs.tachyon.impl", "tachyon.hadoop.TFS")
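
For a MapReduce job, the same mapping can go in Hadoop's core-site.xml, mirroring the fs.tachyon.impl entry shown earlier in this thread (the class name below matches Tachyon 0.7.x; older releases such as 0.2.1 used tachyon.hadoop.TachyonFileSystem):

<property>
  <!-- Map the tachyon:// scheme to Tachyon's Hadoop-compatible FileSystem -->
  <name>fs.tachyon.impl</name>
  <value>tachyon.hadoop.TFS</value>
</property>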


Hope this helps.

Haoyuan

Aakanksha

Sep 2, 2015, 7:28:44 PM
to Tachyon Users, p.aakank...@gmail.com
Hello Haoyuan,

I added the configuration to my core-site.xml and it worked. Thanks for your help! I had not added it earlier since my Hadoop cluster is 2.x. But now I see a fresh issue: when I try the examples on the page, i.e. wordcount or teragen, I get this error:

15/09/01 20:10:35 WARN mapred.LocalJobRunner: job_local1659117566_0001
java.lang.Exception: java.util.ServiceConfigurationError: tachyon.underfs.UnderFileSystemFactory: Provider tachyon.underfs.hdfs.HdfsUnderFileSystemFactory not a subtype
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.util.ServiceConfigurationError: tachyon.underfs.UnderFileSystemFactory: Provider tachyon.underfs.hdfs.HdfsUnderFileSystemFactory not a subtype
at java.util.ServiceLoader.fail(ServiceLoader.java:231)
at java.util.ServiceLoader.access$300(ServiceLoader.java:181)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:369)
at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
at tachyon.underfs.UnderFileSystemRegistry.init(UnderFileSystemRegistry.java:190)
at tachyon.underfs.UnderFileSystemRegistry.<clinit>(UnderFileSystemRegistry.java:83)
at tachyon.underfs.UnderFileSystem.get(UnderFileSystem.java:99)
at tachyon.client.TachyonFS.createAndGetUserUfsTempFolder(TachyonFS.java:300)
at tachyon.client.FileOutStream.<init>(FileOutStream.java:70)
at tachyon.client.TachyonFile.getOutStream(TachyonFile.java:241)
at tachyon.hadoop.AbstractTFS.create(AbstractTFS.java:138)
at tachyon.hadoop.TFS.create(TFS.java:26)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:776)
at org.apache.hadoop.examples.terasort.TeraOutputFormat.getRecordWriter(TeraOutputFormat.java:124)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:647)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/09/01 20:10:36 INFO mapreduce.Job: Job job_local1659117566_0001 running in uber mode : false

I am not sure what I am doing wrong. I even tried a fresh Tachyon build, and the error still appears. Any help is appreciated.

Thanks,
Aakanksha

ji...@yahoo-inc.com

Sep 16, 2015, 1:52:21 PM
to Tachyon Users, p.aakank...@gmail.com
I am facing the same error.
...

Narayanan K

Sep 16, 2015, 2:39:24 PM
to Tachyon Users, p.aakank...@gmail.com
Yeah, same error while trying to run the wordcount MR job with data on HDFS. We use Hadoop 2.6 and compiled Tachyon from source against our Hadoop version.

java.lang.Exception: java.util.ServiceConfigurationError: tachyon.underfs.UnderFileSystemFactory: Provider tachyon.underfs.hdfs.HdfsUnderFileSystemFactory not a subtype

Any suggestions as to what is happening here and how to fix it?

Thanks
Narayanan

Calvin Jia

Sep 16, 2015, 4:20:46 PM
to Tachyon Users, p.aakank...@gmail.com
Hi,

Could you provide the classpath of the job you are running? This problem is likely due to a version conflict.
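
For example, to list which Tachyon jars end up on the job's classpath (a hypothetical check, assuming a standard install with the hadoop command on your PATH):

# Print the effective Hadoop classpath, one entry per line,
# and filter for Tachyon jars; duplicate or mismatched jars can
# cause the "not a subtype" ServiceConfigurationError above.
hadoop classpath | tr ':' '\n' | grep -i tachyon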

Thanks,
Calvin

Aakanksha

Sep 16, 2015, 8:22:43 PM
to Tachyon Users, p.aakank...@gmail.com
Hello Calvin,

It turned out to be a silly mistake in my case. I forgot to change the default path to my Tachyon path in this step:
$ export HADOOP_CLASSPATH=/pathToTachyon/clients/client/target/tachyon-client-0.7.1-jar-with-dependencies.jar

Thanks a lot for your response!

Aakanksha

Calvin Jia

Sep 16, 2015, 8:39:23 PM
to Tachyon Users, p.aakank...@gmail.com
Hi Aakanksha, thanks for the update and for posting your solution!