Getting java.io.FileNotFoundException after configuring HDFS as the underFS


Jais Sebastian

Feb 22, 2017, 4:26:21 AM
to Alluxio Users
Hi ,

I am getting the below error from a Spark job while writing a Parquet file into Alluxio. This happens only after I configured HDFS as the under storage.

6549a1c1-4af3-4355-9a3a-a2b766c68b5f 2017-02-22 04:20:49.296  WARN 6 --- [nio-8082-exec-1] alluxio.logger.type                       361 : Failed to write into AlluxioStore, canceling write attempt.

java.io.FileNotFoundException: /mnt/ramdisk/alluxioworker/.tmp_blocks/1011/5fd7ca4c78732bf3-700007a (No such file or directory)
at java.io.RandomAccessFile.open0(Native Method)
at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:124)
at alluxio.worker.block.io.LocalFileBlockWriter.<init>(LocalFileBlockWriter.java:46)
at alluxio.client.block.LocalBlockOutStream.<init>(LocalBlockOutStream.java:72)
at alluxio.client.block.AlluxioBlockStore.getOutStream(AlluxioBlockStore.java:189)
at alluxio.client.file.FileOutStream.getNextBlock(FileOutStream.java:336)
at alluxio.client.file.FileOutStream.write(FileOutStream.java:301)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
at java.io.DataOutputStream.write(DataOutputStream.java:107)

I added the below Java options to configure the underFS for Alluxio:
-Dalluxio.underfs.address=hdfs://<name server>  -Dalluxio.user.file.writetype.default=CACHE_THROUGH

Regards,
Jais


Pei Sun

Feb 22, 2017, 1:15:40 PM
to Jais Sebastian, Alluxio Users
Hi Jais,
    This can happen either because the worker failed to create the temporary file for some reason (e.g. not enough RAM), or because the temporary file was deleted after the client hung for too long. Can you manually check whether the file (/mnt/ramdisk/alluxioworker/.tmp_blocks/1011/5fd7ca4c78732bf3-700007a) exists, just to confirm? And can you send the logs (client log, worker log, master log)?
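A quick way to do that check on the worker host might look like the sketch below (the path is copied from your stack trace; adjust the block ID for other failures):

```shell
# Sketch: check on the worker host whether a temporary block file still
# exists. The path below is the one from the stack trace in this thread.
check_block() {
  if [ -e "$1" ]; then
    echo "present"
  else
    echo "missing"
  fi
}

check_block /mnt/ramdisk/alluxioworker/.tmp_blocks/1011/5fd7ca4c78732bf3-700007a
```

If it prints "missing", something removed the temporary file before the client finished writing the block.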

Thanks
Pei

--
You received this message because you are subscribed to the Google Groups "Alluxio Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to alluxio-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Pei Sun

Jais Sebastian

Feb 23, 2017, 8:13:34 AM
to Alluxio Users, jais...@gmail.com
Hi Pei,
I have enough memory, and this started showing up after I configured the underFS. I am running in Mesos cluster mode along with Spark. I don't see any errors on the Alluxio worker/master nodes; this error appears in the Spark driver log only. Attached the log file.
I don't see the block file mentioned in the error log on any of the worker nodes.

Note that I haven't specified any additional configuration other than the underFS.

Regards,
Jais



stdout-latest

Pei Sun

Feb 27, 2017, 2:01:40 PM
to Jais Sebastian, Alluxio Users
Hi Jais,
    Which Alluxio and Spark version are you using? 

Pei




--
Pei Sun

Jais Sebastian

Feb 28, 2017, 1:38:28 PM
to Alluxio Users, jais...@gmail.com
Hi Pei,

We are using Spark 2.1 and Alluxio 1.3

Regards,
Jais

Pei Sun

Feb 28, 2017, 9:52:41 PM
to Jais Sebastian, Alluxio Users
Hi Jais,
   1. Are you using containers (e.g. Docker) to run Alluxio and the Spark jobs? If you are, you need to follow this to make sure that both the client and the Alluxio worker have access to the ramdisk.
   2. If 1 does not solve the problem, can you make sure that HDFS doesn't override the tmp files in /mnt/ramdisk?
   3. Can you send me the worker log? If no one deletes the temporary file and the client can access the ramdisk, the only thing I can think of now is that the client is considered lost by the worker. The worker log might have some warning messages.
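For 3, a quick way to scan the worker log might be something like this (the grep patterns and the example log path are just guesses, not the exact Alluxio warning text):

```shell
# Sketch: look for hints in a worker log that the client session timed out
# and its temporary blocks were cleaned up. The patterns are guesses, not
# the exact Alluxio log messages.
scan_worker_log() {
  grep -iE "session|cleanup|removed|timeout" "$1" | tail -n 20
}

# Typical usage (the log location depends on your deployment):
# scan_worker_log /alluxio/logs/worker.log
```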

Pei




--
Pei Sun

Jais Sebastian

Mar 1, 2017, 1:57:18 PM
to Pei Sun, Alluxio Users
Hi Pei,
I am running both Alluxio and Spark on Mesos; both run as Mesos frameworks. Alluxio is launched using Alluxio-Mesos-framework.sh, and Spark is running in client mode with a specified number of executors. I assume both of them share the same RAM disk. I haven't changed any default configuration for Alluxio, only set CACHE_THROUGH and HDFS remote to true.

HDFS is installed remotely on different hosts; they are not co-located.

Note that if I do not configure HDFS, I don't see any error!

I don't see any error logs saying a worker is lost, or any other error from Alluxio. My last mail had an attachment with all the logs.

Regards,
Jais

Sent from my iPhone

Pei Sun

Mar 1, 2017, 2:18:47 PM
to Jais Sebastian, Alluxio Users
Hi Jais,
    Your last attached logs are only from the client side. It would be useful to see the logs on the Alluxio worker/master side as well. This looks strange to me; I don't have any new guesses about what happened beyond what I mentioned in my last email. I might be able to find something in the worker/master logs if you can send them to me.

    What I suggest is that you open a bug in our JIRA and document how to reproduce this. Then I can reproduce it and debug what happened.

Pei
--
Pei Sun

Jais Sebastian

Mar 2, 2017, 12:47:09 PM
to Alluxio Users, jais...@gmail.com
Hi Pei,

Attached the Alluxio logs for your reference.

Regards,
Jais
alluxio_logs.zip

Pei Sun

May 10, 2017, 5:04:44 PM
to Jais Sebastian, Alluxio Users
Hi Jais,
    Have you figured out how to fix this? I was not able to reproduce it. Do you think you can tell me a quick way to reproduce the problem so that I can debug it?

Thank you
Pei




--
Pei Sun

Pei Sun

Jun 8, 2017, 5:38:43 PM
to Jais Sebastian, Alluxio Users
Hi Jais,
   How did you launch your Docker container? Did you pass something like "-v /mnt/ramdisk:/mnt/ramdisk" to share the ramdisk on the host machine with the clients and the worker? If you did not, I think it might throw exceptions like the one you posted.
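For example, something along these lines when starting the worker container (the image name and container name here are just placeholders for whatever you built from your Dockerfile):

```shell
# Sketch: bind-mount the host ramdisk into the Alluxio worker container so
# that the worker and any co-located clients see the same /mnt/ramdisk.
# "my-alluxio-image" is a placeholder for your own image.
docker run -d \
  --name alluxio-worker \
  -v /mnt/ramdisk:/mnt/ramdisk \
  my-alluxio-image
```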

Pei

Hi Pei,

Thanks for checking on this.

We haven't solved this issue yet. Not sure how I can share the details.


Here are the steps we do:
1. Create an Alluxio Docker image (refer to the Dockerfile in docker.zip).
2. Deploy Alluxio in Mesos using Marathon with the below command and configuration:
      - Command: /alluxio/integration/mesos/bin/alluxio-mesos-start.sh -w leader.mesos:5050 master -DHADOOP_USER_NAME=hdfs
      - Environment variables:
             "ALLUXIO_MASTER_HOSTNAME": "master",
             "ALLUXIO_JAVA_OPTS": "-Dalluxio.integration.worker.resource.mem=1048MB -Dalluxio.worker.memory.size=1048MB -Dalluxio.integration.mesos.alluxio.jar.url=http://downloads.alluxio.org/downloads/files/1.4.0/alluxio-1.4.0-bin.tar.gz -Dalluxio.user.file.writetype.default=CACHE_THROUGH -Dalluxio.underfs.address=hdfs://hdfs:8020/tmp"

3. In the Spark driver we also set the below configuration and Java options:
      fs.alluxio.impl=alluxio.hadoop.FileSystem
      spark.driver.extraJavaOptions=-Dalluxio.integration.worker.resource.mem=1048MB -Dalluxio.worker.memory.size=1048MB -Dalluxio.integration.mesos.alluxio.jar.url=http://downloads.alluxio.org/downloads/files/1.4.0/alluxio-1.4.0-bin.tar.gz -Dalluxio.user.file.writetype.default=CACHE_THROUGH -Dalluxio.underfs.address=hdfs://hdfs:8020/tmp
      spark.executor.extraJavaOptions=-Dalluxio.integration.worker.resource.mem=1048MB -Dalluxio.worker.memory.size=1048MB -Dalluxio.integration.mesos.alluxio.jar.url=http://downloads.alluxio.org/downloads/files/1.4.0/alluxio-1.4.0-bin.tar.gz -Dalluxio.user.file.writetype.default=CACHE_THROUGH -Dalluxio.underfs.address=hdfs://hdfs:8020/tmp


Can you:
   1. Verify our Docker image and tell us whether the configurations are good, or whether we need different settings?
   2. We see some error messages related to the journal path etc. Attached the logs [logs from master].
   3. Also attached the exact configuration values [Configuration values.txt].

Our requirement is that Alluxio is deployed in Mesos, with HDFS as the underFS to protect data in case Alluxio crashes.


Please review and let me know your feedback.

Regards,
Jais
--
Regards,
Jais Sebastian
+919980722994

"There are only 10 types of people in the world: Those who understand binary, and those who don't."

(¨`•.•´¨) Keep
`•.¸(¨`•.•´¨) Smiling
      (¨`•.•´¨)¸.•´ Always
       `•.¸.•´



--
Pei Sun