Problem connecting to a Tachyon Node

Alberto Maria Angelo Paro

May 31, 2014, 9:22:14 AM
to tachyo...@googlegroups.com

Hi,
   I set up a remote Tachyon node (0.4.1 binary release) on a server. I have done all the checks: the services (master/slave) listen on the IP. From the local node, everything works OK.

I was unable to connect to the node from a simple test. I wrote the code in Scala, but it should be similar to the Java version:

      import tachyon.client.TachyonFS

      val client = TachyonFS.get("tachyon://192.168.1.5:19998")
      client.connect()  // this is the call that fails


On connect I get an error:
[31-May 14:11:07:570] INFO  [specs2.DefaultExecutionStrategy-1       ] .getUserId - User registered at the master andreino/192.168.1.5:19998 got UserId 2
[31-May 14:11:09:611] INFO  [specs2.DefaultExecutionStrategy-1       ] .connect - Trying to get local worker host : 192.168.1.17
[31-May 14:11:09:909] INFO  [specs2.DefaultExecutionStrategy-1       ] .connect - No local worker on 192.168.1.17
[31-May 14:11:10:204] INFO  [specs2.DefaultExecutionStrategy-1       ] .connect - Connecting remote worker @ andreino/192.168.1.5:29998
[31-May 14:11:10:209] ERROR [specs2.DefaultExecutionStrategy-1       ] .open - java.net.ConnectException: Connection refused
tachyon.org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
at tachyon.org.apache.thrift.transport.TSocket.open(TSocket.java:185)
at tachyon.org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
at tachyon.worker.WorkerClient.open(WorkerClient.java:137)
at tachyon.client.TachyonFS.connect(TachyonFS.java:256)
at tnp.storage.TachyonSpec$$anonfun$1$$anonfun$apply$5.apply(TachyonSpec.scala:22)
at tnp.storage.TachyonSpec$$anonfun$1$$anonfun$apply$5.apply(TachyonSpec.scala:18)
...

As you can see, I was able to connect to the master and obtain the worker address. The failure occurs when opening the Thrift connection to the worker.
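
A quick way to narrow down a "Connection refused" like the one above is to test the master and worker ports separately from the client machine. The following is only a minimal sketch: `PortCheck` and `isReachable` are hypothetical names, not part of Tachyon, and the ports to test (19998 and 29998) are simply the ones that appear in the log.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Tachyon: checks whether a TCP port accepts connections.
class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // "Connection refused", timeouts and unresolved hosts all end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: PortCheck <host> <port> [<port> ...]");
            return;
        }
        String host = args[0];
        for (int i = 1; i < args.length; i++) {
            int port = Integer.parseInt(args[i]);
            System.out.println(host + ":" + port + " reachable = "
                    + isReachable(host, port, 2000));
        }
    }
}
```

For example, after compiling with `javac PortCheck.java`, running `java PortCheck 192.168.1.5 19998 29998` from the client machine would show whether the worker port is reachable at all, independently of the Tachyon client.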

I tried both the Maven artifact:
     "org.tachyonproject"         % "tachyon"          % "0.4.1"
and the one used by Spark:
    "org.tachyonproject"         % "tachyon"          % "0.4.1-thrift"

Is a local node required for storing files on Tachyon or for "navigating" it?

I think that if the Thrift protocol is used, it should work as a remote datastore.

Do you have any hints?

Best regards,
  Alberto Paro

Haoyuan Li

May 31, 2014, 12:45:39 PM
to Alberto Maria Angelo Paro, tachyo...@googlegroups.com
Hi Alberto Paro,

You should be able to create/read/write/delete files without calling "client.connect()".


Thanks,

Haoyuan


--
Haoyuan Li
AMPLab, EECS, UC Berkeley

Alberto Maria Angelo Paro

Jun 4, 2014, 9:13:11 AM
to Haoyuan Li, tachyo...@googlegroups.com
Hi Haoyuan,
  sorry to bother you again.
Removing the client.connect() call fixed the previous error.
But now I have several other issues when trying to write and read using only a remote client.

I ran the BasicOperations example with several write options, and they give me errors.

1) with TRY_CACHE
tachyon.examples.BasicOperations tachyon://ucs1.thenetplanet.net:19998 /testPath4 TRY_CACHE
14/06/04 14:52:48 INFO : Trying to connect master @ ucs1.thenetplanet.net/192.168.3.11:19998
14/06/04 14:52:48 INFO : User registered at the master ucs1.thenetplanet.net/192.168.3.11:19998 got UserId 30
14/06/04 14:52:48 INFO : Trying to get local worker host : 192.168.2.52
14/06/04 14:52:48 INFO : No local worker on 192.168.2.52
14/06/04 14:52:48 INFO : Connecting remote worker @ ucs1/192.168.3.11:29998
14/06/04 14:52:48 INFO : createFile with fileId 52 took 165 ms.
14/06/04 14:52:48 WARN : Fail to cache for: The machine does not have any local worker.
Exception in thread "main" java.io.IOException: BlockIndex 0 is out of the bound in file ClientFileInfo(id:52, name:testPath4, path:/testPath4, checkpointPath:, length:0, blockSizeByte:1073741824, creationTimeMs:1401886370876, complete:true, folder:false, inMemory:true, needPin:false, needCache:true, blockIds:[], dependencyId:-1, inMemoryPercentage:100)
at tachyon.client.TachyonFS.getClientBlockInfo(TachyonFS.java:606)
at tachyon.client.TachyonFile.readByteBuffer(TachyonFile.java:207)
at tachyon.client.TachyonFile.readByteBuffer(TachyonFile.java:199)

2) with the *THROUGH write modes:

tachyon.examples.BasicOperations tachyon://ucs1.thenetplanet.net:19998 /testPath5 CACHE_THROUGH
14/06/04 14:58:49 INFO : Trying to connect master @ ucs1.thenetplanet.net/192.168.3.11:19998
14/06/04 14:58:50 INFO : User registered at the master ucs1.thenetplanet.net/192.168.3.11:19998 got UserId 31
14/06/04 14:58:50 INFO : Trying to get local worker host : 192.168.2.52
14/06/04 14:58:50 INFO : No local worker on 192.168.2.52
14/06/04 14:58:50 INFO : Connecting remote worker @ ucs1/192.168.3.11:29998
14/06/04 14:58:50 INFO : createFile with fileId 53 took 293 ms.
14/06/04 14:58:50 WARN : Fail to cache for: The machine does not have any local worker.
Exception in thread "main" java.io.IOException: FailedToCheckpointException(message:Failed to rename /opt/brainaetic/tachyon/tmp/tachyon/workers/1401884000001/31/53 to /opt/brainaetic/tachyon/tmp/tachyon/data/53)
at tachyon.worker.WorkerClient.addCheckpoint(WorkerClient.java:80)
at tachyon.client.TachyonFS.addCheckpoint(TachyonFS.java:165)
at tachyon.client.FileOutStream.close(FileOutStream.java:96)
at tachyon.examples.BasicOperations.writeFile(BasicOperations.java:93)
at tachyon.examples.BasicOperations.main(BasicOperations.java:58)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
Caused by: FailedToCheckpointException(message:Failed to rename /opt/brainaetic/tachyon/tmp/tachyon/workers/1401884000001/31/53 to /opt/brainaetic/tachyon/tmp/tachyon/data/53)
at tachyon.thrift.WorkerService$addCheckpoint_result$addCheckpoint_resultStandardScheme.read(WorkerService.java:3367)
at tachyon.thrift.WorkerService$addCheckpoint_result$addCheckpoint_resultStandardScheme.read(WorkerService.java:3335)

Logs on Master:
2014-06-04 14:58:52,766 INFO  MASTER_LOGGER (MasterInfo.java:getWorker) - getLocalWorker: no local worker on 192.168.2.52
2014-06-04 14:58:52,834 INFO  MASTER_LOGGER (MasterInfo.java:getClientFileInfo) - getClientFileInfo(/testPath5)

Logs on Worker:
2014-06-04 14:58:52,801 INFO  WORKER_LOGGER (WorkerStorage.java:getUserTempFolder) - Return UserTempFolder for 31 : /opt/brainaetic/tachyon_ramdisk/tachyonworker/users/31
2014-06-04 14:58:52,811 INFO  WORKER_LOGGER (WorkerStorage.java:getUserUnderfsTempFolder) - Return UserHdfsTempFolder for 31 : /opt/brainaetic/tachyon/tmp/tachyon/workers/1401884000001/31
2014-06-04 14:58:52,888 INFO  WORKER_LOGGER (WorkerStorage.java:getUserUnderfsTempFolder) - Return UserHdfsTempFolder for 31 : /opt/brainaetic/tachyon/tmp/tachyon/workers/1401884000001/31
2014-06-04 14:59:03,120 INFO  WORKER_LOGGER (Users.java:removeUser) - Trying to cleanup user 31 :  The user returns 0 bytes. Remove the user's folder /opt/brainaetic/tachyon_ramdisk/tachyonworker/users/31 ; Also remove users underfs folder /opt/brainaetic/tachyon/tmp/tachyon/workers/1401884000001/31



In case 1) the write seems to work, but it fails when reading the written data. Case 2) fails when trying to copy data into directories on which the user has full permissions (I tried reformatting the Tachyon underfs directory).

Can you give me any hints to resolve the problem?

Best regards,
   Alberto


Haoyuan Li

Jun 6, 2014, 1:04:06 PM
to Alberto Maria Angelo Paro, tachyo...@googlegroups.com
Alberto,

If the client is remote, only THROUGH will succeed. E.g., with "CACHE_THROUGH", the client will only do "THROUGH" but not "CACHE". The general idea is that, for both reads and writes, the client will cache the data if there is a worker daemon running on the same node.
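
The rule described above can be written down as a small decision table. This is only an illustrative model, not Tachyon's source code: the class `WriteBehavior`, its methods, and the `wantsCache`/`wantsThrough` flags are hypothetical, and the enum lists only the write options mentioned in this thread.

```java
// Illustrative model only; this is not Tachyon's source code.
class WriteBehavior {

    // Names mirror the write options mentioned in this thread.
    enum WriteType {
        TRY_CACHE(true, false),
        CACHE_THROUGH(true, true),
        THROUGH(false, true);

        final boolean wantsCache;    // asks for the data to be cached in a worker
        final boolean wantsThrough;  // asks for the data to be written to the under filesystem

        WriteType(boolean wantsCache, boolean wantsThrough) {
            this.wantsCache = wantsCache;
            this.wantsThrough = wantsThrough;
        }
    }

    // Caching happens only when a worker runs on the client's own node.
    static boolean cachesData(WriteType type, boolean hasLocalWorker) {
        return type.wantsCache && hasLocalWorker;
    }

    // The "through" part does not depend on a local worker.
    static boolean writesThrough(WriteType type) {
        return type.wantsThrough;
    }

    // Summarizes what effectively happens for a given option and client location.
    static String describe(WriteType type, boolean hasLocalWorker) {
        boolean cache = cachesData(type, hasLocalWorker);
        boolean through = writesThrough(type);
        if (cache && through) return "cache + through";
        if (cache) return "cache";
        if (through) return "through";
        return "nothing stored";
    }

    public static void main(String[] args) {
        // A remote client has no worker on its own node.
        for (WriteType type : WriteType.values()) {
            System.out.println(type + " (remote client): " + describe(type, false));
        }
    }
}
```

Under this model a remote client using TRY_CACHE ends up storing nothing, which would be consistent with the zero-length file and the "BlockIndex 0 is out of the bound" error reported for case 1.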

Best,

Haoyuan