parquet ingestion


Rishi Mishra

Sep 29, 2016, 7:15:46 AM
to Druid User

I get the following error when trying to Hadoop-ingest some Parquet data (I am on Druid 0.9.1.1):

Error: com.metamx.common.RE: Failure on row[PAR1 � � , 8292220022772827096 8072220040034903382 -��]
at io.druid.indexer.HadoopDruidIndexerMapper.map(HadoopDruidIndexerMapper.java:88)
at io.druid.indexer.DetermineHashedPartitionsJob$DetermineCardinalityMapper.run(DetermineHashedPartitionsJob.java:283)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.avro.generic.GenericRecord
at io.druid.data.input.parquet.ParquetHadoopInputRowParser.parse(ParquetHadoopInputRowParser.java:37)
at io.druid.indexer.HadoopDruidIndexerMapper.parseInputRow(HadoopDruidIndexerMapper.java:102)
at io.druid.indexer.HadoopDruidIndexerMapper.map(HadoopDruidIndexerMapper.java:69)
... 8 more

Rishi Mishra

Sep 29, 2016, 9:31:17 AM
to Druid User
I was using a "multi" inputSpec with two children when I got this error. When I ingest each child individually it succeeds, but when I provide both children together I get this error.

Kartik Tripathi

Mar 19, 2018, 11:12:49 AM
to Druid User
Any updates on this? I am facing this issue too. Ingesting each child individually, or using input 'type' 'static' and specifying all the paths, ingests correctly, but 'multi' fails.
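The ClassCastException (org.apache.hadoop.io.Text cannot be cast to org.apache.avro.generic.GenericRecord) suggests the mapper is reading the files through Hadoop's default TextInputFormat instead of the Parquet input format, which is also why the raw Parquet file header (PAR1 followed by binary bytes) shows up in the failed row. A plausible workaround, not verified against 0.9.1.1, is to declare the Parquet inputFormat on each child of the "multi" inputSpec rather than only at the top level, so neither child falls back to the text default. The HDFS paths below are placeholders; the inputFormat class is the one from the druid-parquet-extensions docs:

```json
"ioConfig": {
  "type": "hadoop",
  "inputSpec": {
    "type": "multi",
    "children": [
      {
        "type": "static",
        "inputFormat": "io.druid.data.input.parquet.DruidParquetInputFormat",
        "paths": "hdfs://namenode/data/part1/*.parquet"
      },
      {
        "type": "static",
        "inputFormat": "io.druid.data.input.parquet.DruidParquetInputFormat",
        "paths": "hdfs://namenode/data/part2/*.parquet"
      }
    ]
  }
}
```

If each child carries its own inputFormat, the per-split format resolution in the multi inputSpec should hand GenericRecord instances to ParquetHadoopInputRowParser instead of Text, which would explain why single-child ingestion (where the top-level inputFormat applies directly) already works.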

