Read ORC File Error

Syed Akram

Nov 30, 2015, 2:18:22 AM
to Presto
I am trying to read an ORC file, and I hit the error below:

java.lang.IndexOutOfBoundsException: end index (2775362) must not be greater than size (1696)
        at io.airlift.slice.Preconditions.checkPositionIndexes(Preconditions.java:94)
        at io.airlift.slice.Slice.checkIndexLength(Slice.java:1185)
        at io.airlift.slice.Slice.slice(Slice.java:739)
        at io.airlift.slice.BasicSliceInput.readSlice(BasicSliceInput.java:156)
        at com.facebook.presto.orc.stream.OrcInputStream.advance(OrcInputStream.java:210)
        at com.facebook.presto.orc.stream.OrcInputStream.read(OrcInputStream.java:125)
        at java.io.InputStream.read(InputStream.java:101)
        at com.facebook.presto.hive.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:737)
        at com.facebook.presto.hive.protobuf.CodedInputStream.isAtEnd(CodedInputStream.java:701)
        at com.facebook.presto.hive.protobuf.CodedInputStream.readTag(CodedInputStream.java:99)
        at org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeFooter.<init>(OrcProto.java:10661)
        at org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeFooter.<init>(OrcProto.java:10625)
        at org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeFooter$1.parsePartialFrom(OrcProto.java:10730)
        at org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeFooter$1.parsePartialFrom(OrcProto.java:10725)
        at com.facebook.presto.hive.protobuf.AbstractParser.parseFrom(AbstractParser.java:89)
        at com.facebook.presto.hive.protobuf.AbstractParser.parseFrom(AbstractParser.java:95)
        at com.facebook.presto.hive.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
        at org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeFooter.parseFrom(OrcProto.java:10958)
        at com.facebook.presto.orc.metadata.OrcMetadataReader.readStripeFooter(OrcMetadataReader.java:112)
        at com.facebook.presto.orc.StripeReader.readStripeFooter(StripeReader.java:321)
        at com.facebook.presto.orc.StripeReader.readStripe(StripeReader.java:100)
        at com.facebook.presto.orc.OrcRecordReader.advanceToNextStripe(OrcRecordReader.java:355)
        at com.facebook.presto.orc.OrcRecordReader.advanceToNextRowGroup(OrcRecordReader.java:317)
        at com.facebook.presto.orc.OrcRecordReader.nextBatch(OrcRecordReader.java:281)


Any fix?

Thanks

Dain Sundstrom

Nov 30, 2015, 6:00:11 PM
to presto...@googlegroups.com
The stack shows that the code is attempting to read a new stripe, and the stripe size in the file is larger than the bytes in the file. To me it looks like you have a corrupt ORC file. How did you write this file?
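
A quick way to confirm that is to compare each stripe's offset + length against the actual file length using Hive's own ORC Reader API. A minimal sketch (assuming the Hive 1.x org.apache.hadoop.hive.ql.io.orc classes; the file path is passed in as an argument):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.orc.OrcFile;
import org.apache.hadoop.hive.ql.io.orc.Reader;
import org.apache.hadoop.hive.ql.io.orc.StripeInformation;

public class CheckOrcStripes
{
    public static void main(String[] args) throws Exception
    {
        Configuration conf = new Configuration();
        Path path = new Path(args[0]);
        FileSystem fs = path.getFileSystem(conf);
        long fileLength = fs.getFileStatus(path).getLen();

        // A healthy file has every stripe ending at or before the file length;
        // a truncated/corrupt file has stripes that point past the end.
        Reader reader = OrcFile.createReader(fs, path);
        for (StripeInformation stripe : reader.getStripes())
        {
            long end = stripe.getOffset() + stripe.getLength();
            System.out.println("stripe offset=" + stripe.getOffset()
                    + " length=" + stripe.getLength()
                    + " rows=" + stripe.getNumberOfRows()
                    + (end > fileLength ? "  <-- past end of file (" + fileLength + " bytes)" : ""));
        }
    }
}

The orcfiledump utility (hive --orcfiledump <path>) prints similar stripe metadata without writing any code.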

-dain

Syed Akram

Nov 30, 2015, 10:47:34 PM
to Presto
I wrote the file using the Hive ORC writer.


I used the above code to read the ORC file. As you said, this file has multiple stripes, but I am not able to read it with that code.
Can you suggest any changes?

Thanks

Syed Akram

Nov 30, 2015, 10:57:10 PM
to Presto
FYI, I tried reading a small file with the ORC reader, and it works fine.

Dain Sundstrom

Dec 1, 2015, 9:04:25 PM
to presto...@googlegroups.com
Can you post the code for the writer? IIRC you get this kind of error if you don’t flush the writer.
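
For what it's worth, the ORC writer only writes the remaining buffered stripe and the file footer/postscript when close() is called, so if close() is skipped (for example because an exception escapes addRow()), the file ends up truncated and readers report it as corrupt. A minimal, self-contained sketch of the write-and-close pattern against the old Hive writer API (the two-column schema and generated rows are made up purely for illustration):

import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.orc.CompressionKind;
import org.apache.hadoop.hive.ql.io.orc.OrcFile;
import org.apache.hadoop.hive.ql.io.orc.Writer;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

public class OrcWriteSketch
{
    public static void main(String[] args) throws Exception
    {
        Configuration conf = new Configuration();
        Path path = new Path(args[0]);
        FileSystem fs = path.getFileSystem(conf);

        // Illustrative two-column struct <id int, name string>.
        ObjectInspector inspector = ObjectInspectorFactory.getStandardStructObjectInspector(
                Arrays.asList("id", "name"),
                Arrays.<ObjectInspector>asList(
                        PrimitiveObjectInspectorFactory.writableIntObjectInspector,
                        PrimitiveObjectInspectorFactory.writableStringObjectInspector));

        Writer writer = null;
        try
        {
            writer = OrcFile.createWriter(fs, path, conf, inspector,
                    67108864, CompressionKind.SNAPPY, 4096, 10000);
            for (int i = 0; i < 10000; i++)
            {
                writer.addRow(Arrays.asList(new IntWritable(i), new Text("row-" + i)));
            }
        }
        finally
        {
            // close() flushes buffered data and writes the ORC footer;
            // without it the file is unreadable.
            if (writer != null)
            {
                writer.close();
            }
        }
    }
}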

-dain

Syed Akram

Dec 1, 2015, 10:38:42 PM
to Presto
OrcStructInspector orcStructInsp = createObjectInspector(structFieldNames, fieldObject);
ObjectInspector objInsp = (ObjectInspector) orcStructInsp;
List<? extends StructField> fieldRef = orcStructInsp.getAllStructFieldRefs();
setConf(hdfsIp, hdfsPort);
writer = OrcFile.createWriter(fs, new Path(finalPath),
        conf, objInsp, 67108864, CompressionKind.SNAPPY, 4096, 10000);

int fieldSize = fieldRef.size();
ObjectInspector[] objArray = new ObjectInspector[fieldSize];
for (int i = 0; i < fieldSize; i++)
{
    objArray[i] = fieldRef.get(i).getFieldObjectInspector();
}

// ... (enclosing try block and per-row loop elided) ...
        row.setFieldValue(i, Writables.getWritable(objArray[i], res));
        writer.addRow(row);
}
finally
{
    try
    {
        if (writer != null)
        {
            writer.close();
        }
    }
    // ... (rest of the finally block elided) ...
