Hello everyone.
I was just about to try Tachyon, and although I got the standalone installation working right away, when I tried to use the S3 backend I found both very little information and a lot of problems :-/
My first steps were simply to configure the tachyon-env.sh file following the documentation and to format the filesystem, but when I tried to format it, the following error arose:
[root@tachyon1 tachyon]# ./bin/tachyon format
Connection to tachyon2.foo.com... Formatting Tachyon Worker @ tachyon2.foo.com
Removing local data under folder: /var/sds/tachyon/ramdisk/tachyonworker/
Connection to tachyon2.foo.com closed.
Connection to tachyon3.foo.com... Formatting Tachyon Worker @ tachyon3.foo.com
Removing local data under folder: /var/sds/tachyon/ramdisk/tachyonworker/
Connection to tachyon3.foo.com closed.
Formatting Tachyon Master @ tachyon1.foo.com
Formatting JOURNAL_FOLDER: /var/sds/tachyon/journal/
Exception in thread "main" java.lang.NoClassDefFoundError: org/jets3t/service/S3ServiceException
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.createDefaultStore(NativeS3FileSystem.java:224)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:214)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at tachyon.UnderFileSystemHdfs.<init>(UnderFileSystemHdfs.java:89)
    at tachyon.UnderFileSystemHdfs.getClient(UnderFileSystemHdfs.java:56)
    at tachyon.UnderFileSystem.get(UnderFileSystem.java:69)
    at tachyon.UnderFileSystem.get(UnderFileSystem.java:54)
    at tachyon.Format.formatFolder(Format.java:32)
    at tachyon.Format.main(Format.java:59)
Caused by: java.lang.ClassNotFoundException: org.jets3t.service.S3ServiceException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 13 more

Since it looked as though the necessary libraries were missing, I downloaded jets3t v0.9.2 and copied its jars to the classpath. The error has changed since then:
[root@tachyon1 tachyon]# ./bin/tachyon format
Connection to tachyon2.foo.com... Formatting Tachyon Worker @ tachyon2.foo.com
Removing local data under folder: /var/sds/tachyon/ramdisk/tachyonworker/
Connection to tachyon2.foo.com closed.
Connection to tachyon3.foo.com... Formatting Tachyon Worker @ tachyon3.foo.com
Removing local data under folder: /var/sds/tachyon/ramdisk/tachyonworker/
Connection to tachyon3.foo.com closed.
Formatting Tachyon Master @ tachyon1.foo.com
Formatting JOURNAL_FOLDER: /var/sds/tachyon/journal/
Exception in thread "main" java.lang.NoSuchMethodError: org.jets3t.service.impl.rest.httpclient.RestS3Service.<init>(Lorg/jets3t/service/security/AWSCredentials;)V
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.initialize(Jets3tNativeFileSystemStore.java:54)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at org.apache.hadoop.fs.s3native.$Proxy1.initialize(Unknown Source)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:216)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at tachyon.UnderFileSystemHdfs.<init>(UnderFileSystemHdfs.java:89)
    at tachyon.UnderFileSystemHdfs.getClient(UnderFileSystemHdfs.java:56)
    at tachyon.UnderFileSystem.get(UnderFileSystem.java:69)
    at tachyon.UnderFileSystem.get(UnderFileSystem.java:54)
    at tachyon.Format.formatFolder(Format.java:32)
    at tachyon.Format.main(Format.java:59)

Where could the problem be? My config file (tachyon-env.sh) is the following:
#!/usr/bin/env bash
# This file contains environment variables required to run Tachyon. Copy it as tachyon-env.sh and
# edit that to configure Tachyon for your site. At a minimum,
# the following variables should be set:
#
# - JAVA_HOME, to point to your JAVA installation
# - TACHYON_MASTER_ADDRESS, to bind the master to a different IP address or hostname
# - TACHYON_UNDERFS_ADDRESS, to set the under filesystem address.
# - TACHYON_WORKER_MEMORY_SIZE, to set how much memory to use (e.g. 1000mb, 2gb) per worker
# - TACHYON_RAM_FOLDER, to set where worker stores in memory data
# - TACHYON_UNDERFS_HDFS_IMPL, to set which HDFS implementation to use (e.g. com.mapr.fs.MapRFileSystem,
# org.apache.hadoop.hdfs.DistributedFileSystem)
# The following gives an example:
if [[ `uname -a` == Darwin* ]]; then
  # Assuming Mac OS X
  export JAVA_HOME=${JAVA_HOME:-$(/usr/libexec/java_home)}
  export TACHYON_RAM_FOLDER=/Volumes/ramdisk
  export TACHYON_JAVA_OPTS="-Djava.security.krb5.realm= -Djava.security.krb5.kdc="
else
  # Assuming Linux
  if [ -z "$JAVA_HOME" ]; then
    export JAVA_HOME=/usr/lib/jvm/java-7-oracle
  fi
  export TACHYON_RAM_FOLDER=/var/sds/tachyon/ramdisk
fi
export JAVA="$JAVA_HOME/bin/java"
export TACHYON_MASTER_ADDRESS=tachyon1.foo.com
export TACHYON_UNDERFS_ADDRESS=s3n://XXXXXXX
export TACHYON_WORKER_MEMORY_SIZE=1GB
export TACHYON_UNDERFS_HDFS_IMPL=org.apache.hadoop.hdfs.DistributedFileSystem
CONF_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
export TACHYON_JAVA_OPTS+="
  -Dlog4j.configuration=file:$CONF_DIR/log4j.properties
  -Dtachyon.debug=false
  -Dtachyon.underfs.address=$TACHYON_UNDERFS_ADDRESS
  -Dtachyon.underfs.hdfs.impl=$TACHYON_UNDERFS_HDFS_IMPL
  -Dtachyon.data.folder=$TACHYON_UNDERFS_ADDRESS/tmp/tachyon/data
  -Dtachyon.workers.folder=$TACHYON_UNDERFS_ADDRESS/tmp/tachyon/workers
  -Dtachyon.worker.memory.size=$TACHYON_WORKER_MEMORY_SIZE
  -Dtachyon.worker.data.folder=$TACHYON_RAM_FOLDER/tachyonworker/
  -Dtachyon.master.worker.timeout.ms=60000
  -Dtachyon.master.hostname=$TACHYON_MASTER_ADDRESS
  -Dtachyon.master.journal.folder=/var/sds/tachyon/journal/
  -Dfs.s3n.awsAccessKeyId=XXXXXXXXXXXXXXXXX
  -Dfs.s3n.awsSecretAccessKey=XXXXXXXXXXXXXXXXXXXXXXXX
  -Dtachyon.usezookeeper=true
  -Dtachyon.zookeeper.address=localhost:2181/tachyon
  -Dorg.apache.jasper.compiler.disablejsr199=true
  -Djava.net.preferIPv4Stack=true
"
# Master specific parameters. Default to TACHYON_JAVA_OPTS.
export TACHYON_MASTER_JAVA_OPTS="$TACHYON_JAVA_OPTS"
# Worker specific parameters that will be shared to all workers. Default to TACHYON_JAVA_OPTS.
export TACHYON_WORKER_JAVA_OPTS="$TACHYON_JAVA_OPTS"

Thank you very much.
Francisco Madrid-Salvador
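
P.S. For completeness, this is roughly how I copied the jets3t jars onto Tachyon's classpath, wrapped in a small helper. The directory names are just examples from my install; the exact lib directory that ./bin/tachyon puts on the classpath may differ on your setup:

```shell
#!/usr/bin/env bash
# Sketch of the "copy jets3t jars to the classpath" step.
# Both paths are arguments because they are install-specific (assumptions,
# not fixed Tachyon locations).
copy_jets3t_jars() {
  local src_dir="$1"   # e.g. the unpacked jets3t-0.9.2/jars directory
  local dest_dir="$2"  # e.g. the lib directory Tachyon puts on its classpath
  mkdir -p "$dest_dir"
  cp "$src_dir"/jets3t-*.jar "$dest_dir"/
  # List what actually landed there, so the jets3t version in use is visible.
  ls "$dest_dir"/jets3t-*.jar
}
```

I invoke it along the lines of `copy_jets3t_jars ~/jets3t-0.9.2/jars /opt/tachyon/lib` and restart Tachyon afterwards.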