Ingesting data from Hadoop into Druid fails


Gary Wu

Apr 25, 2016, 3:03:20 AM
to Druid User
Hi,
I am running a Druid 0.9.0 cluster (3 nodes) with Hadoop 2.3.0 and MySQL. When I ingest data from Hadoop into Druid through the Overlord (indexing node), an error is printed and the task always fails.

The related files are attached:
 the task spec: wikiticker-index.json
 the common config: common.runtime.properties
 the task log: index_hadoop_wikiticker_2016-04-25T06_26_59.682Z.txt

I can see the MapReduce job itself runs successfully. Can you spot any configuration error?

Thank you.

------------------- error log ---------------------
2016-04-25T06:27:58,446 INFO [task-runner-0-priority-0] io.druid.indexing.common.task.HadoopIndexTask - Starting a hadoop index generator job...
2016-04-25T06:27:58,466 INFO [task-runner-0-priority-0] io.druid.indexer.path.StaticPathSpec - Adding paths[/test/my-sample.json]
2016-04-25T06:27:58,469 INFO [task-runner-0-priority-0] io.druid.indexer.HadoopDruidIndexerJob - No metadataStorageUpdaterJob set in the config. This is cool if you are running a hadoop index task, otherwise nothing will be uploaded to database.
2016-04-25T06:27:58,494 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_wikiticker_2016-04-25T06:26:59.682Z, type=index_hadoop, dataSource=wikiticker}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:160) ~[druid-indexing-service-0.9.0.jar:0.9.0]
at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:208) ~[druid-indexing-service-0.9.0.jar:0.9.0]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:338) [druid-indexing-service-0.9.0.jar:0.9.0]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:318) [druid-indexing-service-0.9.0.jar:0.9.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_79]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_79]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_79]
at java.lang.Thread.run(Thread.java:745) [?:1.7.0_79]
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_79]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_79]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_79]
at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_79]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:157) ~[druid-indexing-service-0.9.0.jar:0.9.0]
... 7 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: No buckets?? seems there is no data to index.
at io.druid.indexer.IndexGeneratorJob.run(IndexGeneratorJob.java:211) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexer.JobHelper.runJobs(JobHelper.java:323) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:94) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:261) ~[druid-indexing-service-0.9.0.jar:0.9.0]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_79]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_79]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_79]
at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_79]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:157) ~[druid-indexing-service-0.9.0.jar:0.9.0]
... 7 more
Caused by: java.lang.RuntimeException: No buckets?? seems there is no data to index.
at io.druid.indexer.IndexGeneratorJob.run(IndexGeneratorJob.java:172) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexer.JobHelper.runJobs(JobHelper.java:323) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:94) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:261) ~[druid-indexing-service-0.9.0.jar:0.9.0]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_79]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_79]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_79]
at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_79]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:157) ~[druid-indexing-service-0.9.0.jar:0.9.0]
... 7 more
2016-04-25T06:27:58,512 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_wikiticker_2016-04-25T06:26:59.682Z",
  "status" : "FAILED",
  "duration" : 51174
}
2016-04-25T06:27:58,520 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.server.coordination.AbstractDataSegmentAnnouncer.stop()] on object[io.druid.server.coordination.BatchDataSegmentAnnouncer@5d7a4ab].


zhangxin...@gmail.com

Apr 25, 2016, 8:37:01 AM
to Druid User
I ran into this problem today too. I am running an imply-1.2.0 cluster with 4 nodes (1 master node, 2 data nodes, 1 query node), and the ingestion task always fails...

On Monday, April 25, 2016 at 3:03:20 PM UTC+8, Gary Wu wrote:

Fangjin Yang

Apr 26, 2016, 9:03:57 PM
to Druid User
This error means the data you are trying to index does not fall within the "intervals" object you provided in the ingestion spec. The easiest fix is to make sure everything is running in the UTC timezone and that your data's timestamps are in UTC as well.
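
For example, here is roughly the relevant piece of the quickstart spec (a minimal sketch; the interval shown matches the wikiticker sample data, and yours must cover the timestamps actually present in your input, otherwise the indexer finds no buckets to build):

  "granularitySpec" : {
    "type" : "uniform",
    "segmentGranularity" : "day",
    "queryGranularity" : "none",
    "intervals" : ["2015-09-12/2015-09-13"]
  }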

Gary Wu

Apr 27, 2016, 2:33:06 AM
to Druid User

Hi Fangjin,
Yes, that works in my environment. I changed the cluster timezone to UTC and also adjusted the time intervals in the spec, and the data now ingests successfully.
Thanks a lot.
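
For anyone else who hits this: the change amounts to forcing UTC on both the Druid JVMs and the Hadoop mapper/reducer JVMs. A sketch of what that looks like (flag placement follows the Druid docs; memory flags and other settings omitted):

In each Druid service's jvm.config:
  -Duser.timezone=UTC
  -Dfile.encoding=UTF-8

And in the task spec's tuningConfig, so the MapReduce tasks also run in UTC:
  "jobProperties" : {
    "mapreduce.map.java.opts" : "-Duser.timezone=UTC -Dfile.encoding=UTF-8",
    "mapreduce.reduce.java.opts" : "-Duser.timezone=UTC -Dfile.encoding=UTF-8"
  }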

kl...@sertiscorp.com

Jun 14, 2017, 3:45:20 AM
to Druid User

Hi Gary,

Could you please share how you changed the cluster timezone? My cluster (per the Unix date command) is already in UTC. Also, how did you adjust the time interval? I am experiencing the same problem as yours, and I can't even get the quickstart to run, so I don't know whether my cluster is working at all.

Best Regards,
Kamolphan