parquet ingestion


Rishi Mishra

Sep 29, 2016, 7:15:46 AM
to Druid User

I get the following error when trying to Hadoop-ingest some Parquet data (I am on Druid 0.9.1.1):

Error: com.metamx.common.RE: Failure on row[PAR1 � � , 8292220022772827096 8072220040034903382 -��]
at io.druid.indexer.HadoopDruidIndexerMapper.map(HadoopDruidIndexerMapper.java:88)
at io.druid.indexer.DetermineHashedPartitionsJob$DetermineCardinalityMapper.run(DetermineHashedPartitionsJob.java:283)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.avro.generic.GenericRecord
at io.druid.data.input.parquet.ParquetHadoopInputRowParser.parse(ParquetHadoopInputRowParser.java:37)
at io.druid.indexer.HadoopDruidIndexerMapper.parseInputRow(HadoopDruidIndexerMapper.java:102)
at io.druid.indexer.HadoopDruidIndexerMapper.map(HadoopDruidIndexerMapper.java:69)
... 8 more

Rishi Mishra

Sep 29, 2016, 9:31:17 AM
to Druid User
I was using a "multi" inputSpec with two children when I got this error. When I ingest each child individually it succeeds, but when I provide both children together I get this error.

Kartik Tripathi

Mar 19, 2018, 11:12:49 AM
to Druid User
Any updates on this? I am facing this issue too. Ingesting each child individually, or using input 'type' 'static' and specifying all the paths, ingests correctly, but 'multi' fails.
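The ClassCastException (org.apache.hadoop.io.Text cannot be cast to org.apache.avro.generic.GenericRecord) suggests the mapper is reading the files through Hadoop's default TextInputFormat instead of the Parquet input format, which is also why the raw Parquet file header (PAR1 followed by binary bytes) shows up in the failed row. A plausible workaround, not verified against 0.9.1.1, is to declare the Parquet inputFormat on each child of the "multi" inputSpec rather than only at the top level, so neither child falls back to the text default. The HDFS paths below are placeholders; the inputFormat class is the one from the druid-parquet-extensions docs:

```json
"ioConfig": {
  "type": "hadoop",
  "inputSpec": {
    "type": "multi",
    "children": [
      {
        "type": "static",
        "inputFormat": "io.druid.data.input.parquet.DruidParquetInputFormat",
        "paths": "hdfs://namenode/data/part1/*.parquet"
      },
      {
        "type": "static",
        "inputFormat": "io.druid.data.input.parquet.DruidParquetInputFormat",
        "paths": "hdfs://namenode/data/part2/*.parquet"
      }
    ]
  }
}
```

If each child carries its own inputFormat, the per-split format resolution in the multi inputSpec should hand GenericRecord instances to ParquetHadoopInputRowParser instead of Text, which would explain why single-child ingestion (where the top-level inputFormat applies directly) already works.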

