Hey, I am trying to get a simple record count of an Avro file, and it is proving to be very difficult. My query is:
DROP TABLE IF EXISTS QCB_DZD_2;
CREATE EXTERNAL TABLE QCB_DZD_2(one STRING)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION 's3://p3-segment-data/profile/hh/10000/Profile_1000.avro';
SELECT COUNT(*) FROM QCB_DZD_2;
This fails with:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
I am wondering whether this is because the file uses the 'snappy' codec, and if so, I'm not sure how I would solve that. Any help would be great. Thanks.