Error when reading avro file using ravro

kumar deepak

Sep 1, 2015, 2:43:14 AM
to RHadoop
I am trying out RHadoop, specifically ravro. Is there a tutorial on using the different components of RHadoop? It would be very helpful.

I tried reading a valid Avro file:
read.avro("/TREEJOIN/avro/6dfc4e2cae3d8708e2216a0f/dw/part-r-00000.avro")

and got this error:
Error in avro_get_schema(file) : 
  Error retrieving schema.  Verify that the file exists and is a valid Avro: /TREEJOIN/avro/6dfc4e2cae3d8708e2216a0f/dw/part-r-00000.avro

Any guidance would be appreciated.

I saw similar issues in previous threads, but none of them has a solution yet.

kumar deepak

Sep 1, 2015, 8:06:07 AM
to RHadoop
The read.avro function reads from the local filesystem by default. When the file is on the local filesystem, the read works. I could not figure out a way to specify an HDFS path.
I tried the full HDFS path (hdfs://...:8020/...) and also read.avro(hdfs.file(filepath)), but neither works.
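
Since read.avro works on local files, one possible workaround is to copy the file out of HDFS first and read the local copy. The following is only a sketch, not something confirmed in this thread: it assumes the standard `hdfs` command-line client is on the PATH and that ravro is installed. The helper names `hdfs_get_cli` and `read_avro_from_hdfs`, and the `fetch` and `reader` parameters, are hypothetical and introduced here for illustration.

```r
# Hypothetical helper: copy one file from HDFS to the local filesystem
# using the standard `hdfs dfs -get` command (assumed to be on the PATH).
hdfs_get_cli <- function(hdfs_path, local_path) {
  status <- system2("hdfs",
                    c("dfs", "-get", shQuote(hdfs_path), shQuote(local_path)))
  if (status != 0) {
    stop("hdfs dfs -get failed for ", hdfs_path)
  }
  invisible(local_path)
}

# Hypothetical helper: fetch an Avro file from HDFS into a temporary
# local file, read it, and delete the temporary copy afterwards.
# `reader` defaults to ravro's read.avro; it is only looked up when used.
read_avro_from_hdfs <- function(hdfs_path,
                                fetch = hdfs_get_cli,
                                reader = ravro::read.avro) {
  # tempfile() only returns a name, so `hdfs dfs -get` does not
  # trip over an already existing destination file.
  local_path <- tempfile(fileext = ".avro")
  on.exit(unlink(local_path), add = TRUE)
  fetch(hdfs_path, local_path)
  reader(local_path)
}
```

With the file from the first message, the call would be read_avro_from_hdfs("/TREEJOIN/avro/6dfc4e2cae3d8708e2216a0f/dw/part-r-00000.avro"). The copy step is passed in as a function so that a different fetch method can be swapped in without touching the reading logic.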

Why don't we create a simple getting-started guide? It would save so much time. I have not found a single tutorial for this. I can write it if you suggest a few starting points.

Thanks,
Kr Deepak