HBase does not start with Tachyon


Daniel

16 Jul 2014, 7:18:16
to tachyo...@googlegroups.com
Hello, 

I am having trouble starting my HBase cluster on top of Tachyon.

Here is the sequence of steps I followed.

1. Update the Hadoop configuration, in both core-site.xml and hdfs-site.xml:
    <property>
        <name>fs.tachyon.impl</name>
        <value>tachyon.hadoop.TFS</value>
    </property>

2. Add the tachyon-0.4.1-thrift-jar-with-dependencies.jar file to the Hadoop lib folder.

3. Start the Hadoop cluster (with HA, version 2.4.0).

4. Start Tachyon in cluster mode.

   Change the underfs from local to HDFS:
   TACHYON_UNDERFS_ADDRESS=hdfs://hostname

5. Update the HBase configuration, hbase-site.xml:
    <property>
        <name>fs.tachyon.impl</name>
        <value>tachyon.hadoop.TFS</value>
    </property>
    <property>
        <name>hbase.rootdir</name>
        <value>tachyon://hostname:19998/hbase</value>
    </property>
  
6. Add the tachyon-0.4.1-thrift-jar-with-dependencies.jar file to the HBase lib folder.

7. Start HBase.
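
Steps 1 and 5 matter because Hadoop's FileSystem layer finds the implementation class for a URI scheme it does not ship with through the property fs.<scheme>.impl. Below is a minimal sketch of that consistency check, assuming a site file in the usual <property> / <name> / <value> layout. The class SiteXmlCheck and its methods are hypothetical helpers written for this illustration; they are not part of Hadoop, HBase, or Tachyon.

```java
import java.io.ByteArrayInputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, for illustration only.
class SiteXmlCheck {

    /** Reads the name/value pairs of every <property> element in a site file. */
    static Map<String, String> parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> props = new HashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element p = (Element) nodes.item(i);
            String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
            String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
            props.put(name, value);
        }
        return props;
    }

    /** Returns the fs.<scheme>.impl key that the scheme of hbase.rootdir needs. */
    static String requiredImplKey(Map<String, String> props) {
        String scheme = URI.create(props.get("hbase.rootdir")).getScheme();
        return "fs." + scheme + ".impl";
    }

    public static void main(String[] args) throws Exception {
        // The same two properties as in step 5 of this post.
        String xml = "<configuration>"
                + "<property><name>fs.tachyon.impl</name><value>tachyon.hadoop.TFS</value></property>"
                + "<property><name>hbase.rootdir</name><value>tachyon://hostname:19998/hbase</value></property>"
                + "</configuration>";
        Map<String, String> props = parse(xml);
        String key = requiredImplKey(props);
        String impl = props.get(key);
        System.out.println(key + " = " + (impl == null ? "MISSING" : impl));
    }
}
```

A check like this only confirms that the scheme is mapped to a class. It says nothing about whether that class supports every operation HBase calls.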

This does not work.
Here is the log:


2014-07-16 20:06:29,251 INFO  [master:BDB00:60000] : FileDoesNotExistException(message:/hbase/data/hbase/meta/1588230740)/hbase/data/hbase/meta/1588230740
2014-07-16 20:06:29,251 INFO  [master:BDB00:60000] : File does not exist: tachyon://BDB00:19998/hbase/data/hbase/meta/1588230740
2014-07-16 20:06:29,252 INFO  [master:BDB00:60000] : create(tachyon://BDB00:19998/hbase/hbase.version, rw-r--r--, true, 16384, 1, 33554432, null)
2014-07-16 20:06:29,252 WARN  [master:BDB00:60000] : tachyon.home is not set. Using /mnt/tachyon_default_home as the default value.
2014-07-16 20:06:29,286 WARN  [master:BDB00:60000] : Fail to cache for: The machine does not have any local worker.
2014-07-16 20:06:29,295 WARN  [master:BDB00:60000] util.FSUtils: Unable to create version file at tachyon://BDB00:19998/hbase, retrying
java.io.IOException: FailedToCheckpointException(message:Failed to rename /home/hdfs/tachyon-0.4.1/libexec/../underfs/tmp/tachyon/workers/1405424004604/8/20 to /home/hdfs/tachyon-0.4.1/libexec/../underfs/tmp/tachyon/data/20)
    at tachyon.worker.WorkerClient.addCheckpoint(WorkerClient.java:83)
    at tachyon.client.TachyonFS.addCheckpoint(TachyonFS.java:156)
    at tachyon.client.FileOutStream.close(FileOutStream.java:205)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:70)
    at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:103)
    at org.apache.hadoop.hbase.util.FSUtils.setVersion(FSUtils.java:650)
    at org.apache.hadoop.hbase.util.FSUtils.setVersion(FSUtils.java:629)
    at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:587)
    at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:456)
    at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:147)
    at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:128)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:802)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: FailedToCheckpointException(message:Failed to rename /home/hdfs/tachyon-0.4.1/libexec/../underfs/tmp/tachyon/workers/1405424004604/8/20 to /home/hdfs/tachyon-0.4.1/libexec/../underfs/tmp/tachyon/data/20)
    at tachyon.thrift.WorkerService$addCheckpoint_result$addCheckpoint_resultStandardScheme.read(WorkerService.java:2687)
    at tachyon.thrift.WorkerService$addCheckpoint_result$addCheckpoint_resultStandardScheme.read(WorkerService.java:2655)
    at tachyon.thrift.WorkerService$addCheckpoint_result.read(WorkerService.java:2581)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at tachyon.thrift.WorkerService$Client.recv_addCheckpoint(WorkerService.java:148)
    at tachyon.thrift.WorkerService$Client.addCheckpoint(WorkerService.java:134)
    at tachyon.worker.WorkerClient.addCheckpoint(WorkerClient.java:77)
    ... 13 more
2014-07-16 20:06:29,297 INFO  [master:BDB00:60000] : delete(tachyon://BDB00:19998/hbase/hbase.version, false)
2014-07-16 20:06:39,299 INFO  [master:BDB00:60000] : create(tachyon://BDB00:19998/hbase/hbase.version, rw-r--r--, true, 16384, 1, 33554432, null)
2014-07-16 20:06:39,352 WARN  [master:BDB00:60000] : Fail to cache for: The machine does not have any local worker.
2014-07-16 20:06:39,353 WARN  [master:BDB00:60000] util.FSUtils: Unable to create version file at tachyon://BDB00:19998/hbase, retrying



PS. 

Is it possible to use Tachyon with Hadoop HA, as below?

TACHYON_UNDERFS_ADDRESS=hdfs://ClusterName

I have already added the Hadoop configuration files (core-site.xml and hdfs-site.xml) to the Tachyon classpath, but it failed.

Calvin Jia

17 Jul 2014, 2:11:03
to tachyo...@googlegroups.com
Hi Daniel,

Could you provide your tachyon-env.sh? The logs seem to indicate that the tachyon data folder is pointing to a location on the local filesystem (which shouldn't be the case since you specified underfs to be hdfs).

Also, I noticed you are using Hadoop 2.4. Tachyon by default is compatible with hadoop-1.0.4, and for other versions it is recommended to recompile it against the appropriate version. You can do this by modifying the Hadoop version in the pom or by specifying the flag -Dhadoop.version when compiling with Maven.

Daniel

17 Jul 2014, 3:14:28
to tachyo...@googlegroups.com
Thank you for your reply.

I have already compiled Tachyon against Hadoop 2.4.0 by updating pom.xml
(mvn -Dhadoop.version=2.4.0 clean package).

Here is my tachyon-env.sh:

export JAVA_HOME=/usr/java/latest
export TACHYON_RAM_FOLDER=/data13/ramdisk
export JAVA="$JAVA_HOME/bin/java"
export TACHYON_MASTER_ADDRESS=BDB00
#export TACHYON_UNDERFS_ADDRESS=$TACHYON_HOME/underfs
export TACHYON_UNDERFS_ADDRESS=hdfs://BDB00:8020
export TACHYON_WORKER_MEMORY_SIZE=60GB
export TACHYON_UNDERFS_HDFS_IMPL=org.apache.hadoop.hdfs.DistributedFileSystem

CONF_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

export TACHYON_JAVA_OPTS+="
  -Dlog4j.configuration=file:$CONF_DIR/log4j.properties
  -Dtachyon.debug=true
  -Dtachyon.home=/home/hdfs/tachyon
  -Dtachyon.underfs.address=$TACHYON_UNDERFS_ADDRESS
  -Dtachyon.underfs.hdfs.impl=$TACHYON_UNDERFS_HDFS_IMPL
  -Dtachyon.data.folder=$TACHYON_UNDERFS_ADDRESS/tmp/tachyon/data
  -Dtachyon.workers.folder=$TACHYON_UNDERFS_ADDRESS/tmp/tachyon/workers
  -Dtachyon.worker.memory.size=$TACHYON_WORKER_MEMORY_SIZE
  -Dtachyon.worker.data.folder=$TACHYON_RAM_FOLDER/tachyonworker/
  -Dtachyon.master.hostname=$TACHYON_MASTER_ADDRESS
  -Dtachyon.master.journal.folder=$TACHYON_HOME/journal/
  -Dtachyon.master.pinlist=/pinfiles;/pindata
  -Dorg.apache.jasper.compiler.disablejsr199=true
"




Daniel

17 Jul 2014, 3:41:17
to tachyo...@googlegroups.com
I am also attaching the HBase error log.

I am puzzled, because I have already added the following to hbase-site.xml:

    <property>
        <name>hbase.rootdir</name>
        <!--value>hdfs://BDB/hbase</value-->
        <value>tachyon://BDB00:19998/hbase</value>
    </property>
    <property>
        <name>fs.tachyon.impl</name>
        <value>tachyon.hadoop.TFS</value>
    </property>





Hbase-hdfs-master-BDB00.log


2014-07-17 16:32:34,654 INFO  [master:BDB00:60000] : FileDoesNotExistException(message:/hbase/data/hbase/meta/1588230740/WALs)/hbase/data/hbase/meta/1588230740/WALs
2014-07-17 16:32:34,654 INFO  [master:BDB00:60000] : File does not exist: tachyon://BDB00:19998/hbase/data/hbase/meta/1588230740/WALs
2014-07-17 16:32:34,654 INFO  [master:BDB00:60000] : mkdirs(tachyon://BDB00:19998/hbase/data/hbase/meta/1588230740/WALs, rwxrwxrwx)
2014-07-17 16:32:34,655 INFO  [master:BDB00:60000] : getFileStatus(tachyon://BDB00:19998/hbase/data/hbase/meta/1588230740/oldWALs): HDFS Path: hdfs://BDB00/hbase/data/hbase/meta/1588230740/oldWALs TPath: tachyon://BDB00:19998/hbase/data/hbase/meta/1588230740/oldWALs
2014-07-17 16:32:34,657 INFO  [master:BDB00:60000] : FileDoesNotExistException(message:/hbase/data/hbase/meta/1588230740/oldWALs)/hbase/data/hbase/meta/1588230740/oldWALs
2014-07-17 16:32:34,657 INFO  [master:BDB00:60000] : File does not exist: tachyon://BDB00:19998/hbase/data/hbase/meta/1588230740/oldWALs
2014-07-17 16:32:34,658 INFO  [master:BDB00:60000] : mkdirs(tachyon://BDB00:19998/hbase/data/hbase/meta/1588230740/oldWALs, rwxrwxrwx)
2014-07-17 16:32:34,659 INFO  [master:BDB00:60000] : getFileStatus(tachyon://BDB00:19998/hbase/data/hbase/meta/1588230740/WALs/hlog.1405582354658): HDFS Path: hdfs://BDB00/hbase/data/hbase/meta/1588230740/WALs/hlog.1405582354658 TPath: tachyon://BDB00:19998/hbase/data/hbase/meta/1588230740/WALs/hlog.1405582354658
2014-07-17 16:32:34,661 INFO  [master:BDB00:60000] : FileDoesNotExistException(message:/hbase/data/hbase/meta/1588230740/WALs/hlog.1405582354658)/hbase/data/hbase/meta/1588230740/WALs/hlog.1405582354658
2014-07-17 16:32:34,661 INFO  [master:BDB00:60000] : File does not exist: tachyon://BDB00:19998/hbase/data/hbase/meta/1588230740/WALs/hlog.1405582354658
2014-07-17 16:32:34,664 ERROR [master:BDB00:60000] master.MasterFileSystem: bootstrap
java.io.IOException: cannot get log writer
    at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:197)
    at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWALWriter(HLogFactory.java:177)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.createWriterInstance(FSHLog.java:620)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:546)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:503)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.<init>(FSHLog.java:418)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.<init>(FSHLog.java:291)
    at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createHLog(HLogFactory.java:45)
    at org.apache.hadoop.hbase.regionserver.HRegion.createHRegion(HRegion.java:4313)
    at org.apache.hadoop.hbase.regionserver.HRegion.createHRegion(HRegion.java:4280)
    at org.apache.hadoop.hbase.regionserver.HRegion.createHRegion(HRegion.java:4253)
    at org.apache.hadoop.hbase.regionserver.HRegion.createHRegion(HRegion.java:4331)
    at org.apache.hadoop.hbase.regionserver.HRegion.createHRegion(HRegion.java:4211)
    at org.apache.hadoop.hbase.master.MasterFileSystem.bootstrap(MasterFileSystem.java:528)
    at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:479)
    at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:147)
    at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:128)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:802)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: createNonRecursive unsupported for this filesystem class tachyon.hadoop.TFS
    at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1135)
    at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1110)
    at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1086)
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:78)
    at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:194)
    ... 19 more
2014-07-17 16:32:34,666 FATAL [master:BDB00:60000] master.HMaster: Unhandled exception. Starting shutdown.
java.io.IOException: cannot get log writer
    at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:197)
    at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWALWriter(HLogFactory.java:177)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.createWriterInstance(FSHLog.java:620)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:546)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:503)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.<init>(FSHLog.java:418)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.<init>(FSHLog.java:291)
    at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createHLog(HLogFactory.java:45)
    at org.apache.hadoop.hbase.regionserver.HRegion.createHRegion(HRegion.java:4313)
    at org.apache.hadoop.hbase.regionserver.HRegion.createHRegion(HRegion.java:4280)
    at org.apache.hadoop.hbase.regionserver.HRegion.createHRegion(HRegion.java:4253)
    at org.apache.hadoop.hbase.regionserver.HRegion.createHRegion(HRegion.java:4331)
    at org.apache.hadoop.hbase.regionserver.HRegion.createHRegion(HRegion.java:4211)
    at org.apache.hadoop.hbase.master.MasterFileSystem.bootstrap(MasterFileSystem.java:528)
    at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:479)
    at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:147)
    at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:128)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:802)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: createNonRecursive unsupported for this filesystem class tachyon.hadoop.TFS
    at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1135)
    at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1110)
    at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1086)
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:78)
    at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:194)
    ... 19 more
2014-07-17 16:32:34,667 INFO  [master:BDB00:60000] master.HMaster: Aborting
2014-07-17 16:32:34,667 DEBUG [master:BDB00:60000] master.HMaster: Stopping service threads
2014-07-17 16:32:34,667 INFO  [master:BDB00:60000] ipc.RpcServer: Stopping server on 60000
2014-07-17 16:32:34,667 INFO  [RpcServer.listener,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: stopping
2014-07-17 16:32:34,669 INFO  [master:BDB00:60000] master.HMaster: Stopping infoServer
2014-07-17 16:32:34,669 INFO  [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped
2014-07-17 16:32:34,669 INFO  [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopping
2014-07-17 16:32:34,671 INFO  [master:BDB00:60000] mortbay.log: Stopped SelectChann...@0.0.0.0:60010
2014-07-17 16:32:34,800 INFO  [master:BDB00:60000] zookeeper.ZooKeeper: Session: 0x34738965ca00060 closed
2014-07-17 16:32:34,800 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2014-07-17 16:32:34,800 INFO  [master:BDB00:60000] master.HMaster: HMaster main thread exiting
2014-07-17 16:32:34,800 ERROR [main] master.HMasterCommandLine: Master exiting






Daniel

17 Jul 2014, 5:13:32
to tachyo...@googlegroups.com
Thank you for your help, Calvin.

I found the root cause.

TFS.java does not implement the createNonRecursive method.

@Haoyuan Li Do you have a plan for adding this method?

There is no implementation in 0.5.0 or on the master branch.

Or is there any way to work around this issue?




Haoyuan Li

21 Jul 2014, 4:23:18
to Daniel, tachyo...@googlegroups.com
Hi Daniel,

Thanks for reporting this. Yes, it would be great to implement this method. Do you mind filing a JIRA for this? https://spark-project.atlassian.net/browse/TACHYON

Thanks,

Haoyuan





--
Haoyuan Li
AMPLab, EECS, UC Berkeley

Daniel

21 Jul 2014, 5:09:39
to tachyo...@googlegroups.com, sofa...@gmail.com
OK, I will.

Could you please give me the necessary permission?




Daniel

21 Jul 2014, 22:26:54
to tachyo...@googlegroups.com
I have added the missing function.

HBase now works fine.

Hadoop 2.4.0 + HBase 0.98.3 + ZooKeeper 3.4.6 + Tachyon 0.4.1-thrift

Everything is the Apache version.

Thanks, everyone. I will add this issue to JIRA.
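
The function itself is not shown in the thread. As a rough illustration of the semantics involved (createNonRecursive must fail when the parent directory is missing, where a plain create would make it), the following self-contained sketch uses an in-memory stand-in. InMemoryFs and all of its methods are hypothetical. This is not Tachyon's TFS code and not Hadoop's FileSystem API, whose createNonRecursive takes additional arguments such as permission, buffer size, replication, and block size.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

// Illustrative only: an in-memory stand-in, not Tachyon's TFS.
class CreateNonRecursiveSketch {

    /** Hypothetical minimal filesystem that tracks directories and files by path. */
    static class InMemoryFs {
        private final Set<String> dirs = new HashSet<>();
        private final Set<String> files = new HashSet<>();

        InMemoryFs() {
            dirs.add("/"); // the root always exists
        }

        /** Parent of an absolute path, e.g. "/a/b" gives "/a" and "/a" gives "/". */
        static String parentOf(String path) {
            int idx = path.lastIndexOf('/');
            return idx <= 0 ? "/" : path.substring(0, idx);
        }

        /** Recursive mkdirs: also creates every missing ancestor. */
        void mkdirs(String path) {
            for (String p = path; !dirs.contains(p); p = parentOf(p)) {
                dirs.add(p);
            }
        }

        /** Ordinary create: silently creates missing parent directories. */
        void create(String path, boolean overwrite) throws IOException {
            if (files.contains(path) && !overwrite) {
                throw new IOException("File already exists: " + path);
            }
            mkdirs(parentOf(path));
            files.add(path);
        }

        /** Non-recursive create: fails instead of creating a missing parent. */
        void createNonRecursive(String path, boolean overwrite) throws IOException {
            String parent = parentOf(path);
            if (!dirs.contains(parent)) {
                throw new FileNotFoundException("Parent directory does not exist: " + parent);
            }
            create(path, overwrite); // the parent exists, so delegating is safe
        }

        boolean isFile(String path) {
            return files.contains(path);
        }
    }

    public static void main(String[] args) throws IOException {
        InMemoryFs fs = new InMemoryFs();
        fs.mkdirs("/hbase/WALs");
        fs.createNonRecursive("/hbase/WALs/hlog.1", false); // parent exists
        try {
            fs.createNonRecursive("/hbase/missing/hlog.2", false); // parent missing
        } catch (FileNotFoundException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In a real FileSystem subclass the same shape would presumably apply: check that the parent exists, then delegate to the existing create.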

