Druid batch loader job failed

Hong Wang

Mar 14, 2017, 4:34:54 PM
to Druid User
Hi,

Yesterday we had a hanging real-time ingestion job, and I had to submit a task to kill it. Afterwards, when I tried to batch load the data from Cassandra to Druid via an AWS S3 bucket, the job failed with a 'No input paths specified' error. I checked the S3 bucket, and it is true that the folder does not exist. Any idea why I am getting this error? I know we have data in Cassandra that needs to be loaded into Druid.

I also noticed that after I killed the real-time ingestion, my data source 'ABC_Company' showed as 'Disabled' in the coordinator console, and I am not able to re-enable it.

Disabled Datasources:

  • ABC_Company
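For reference, a minimal sketch of how a re-enable request could be built against the coordinator's HTTP API. It assumes the Druid 0.9.x endpoint `POST /druid/coordinator/v1/datasources/{dataSourceName}`, which enables the datasource's non-overshadowed segments; the coordinator address below is a placeholder, and the call may have no effect if no segments were ever published for the datasource.

```python
from urllib.parse import quote
from urllib.request import Request


def enable_datasource_request(coordinator, datasource):
    """Build (but do not send) a POST request that asks the coordinator
    to enable a datasource. Endpoint path assumed from Druid 0.9.x."""
    url = "{}/druid/coordinator/v1/datasources/{}".format(
        coordinator.rstrip("/"), quote(datasource, safe="")
    )
    return Request(url, method="POST")


# Hypothetical coordinator host; 8081 is only the usual default port.
req = enable_datasource_request("http://coordinator.example.com:8081", "ABC_Company")
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req)
```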

---batch job log ---
2017-03-14T19:48:20,028 INFO [task-runner-0-priority-0] io.druid.indexer.path.GranularityPathSpec - Checking path[s3n://fs-bigdata-test/batch_jobs/2017_03_14_19_03_09/ABC_Company/druid/2017/01]
2017-03-14T19:48:20,917 INFO [task-runner-0-priority-0] io.druid.indexer.path.GranularityPathSpec - Checking path[s3n://fs-bigdata-test/batch_jobs/2017_03_14_19_03_09/ABC_Company/druid/2017/02]
2017-03-14T19:48:21,154 INFO [task-runner-0-priority-0] io.druid.indexer.path.GranularityPathSpec - Checking path[s3n://fs-bigdata-test/batch_jobs/2017_03_14_19_03_09/ABC_Company/druid/2017/03]
2017-03-14T19:48:21,388 INFO [task-runner-0-priority-0] io.druid.indexer.path.GranularityPathSpec - Appending path []
2017-03-14T19:48:21,427 INFO [task-runner-0-priority-0] io.druid.indexer.path.GranularityPathSpec - Checking path[s3n://fs-bigdata-test/batch_jobs/2017_03_14_19_03_09/ABC_Company/druid/2017/01]
2017-03-14T19:48:21,661 INFO [task-runner-0-priority-0] io.druid.indexer.path.GranularityPathSpec - Checking path[s3n://fs-bigdata-test/batch_jobs/2017_03_14_19_03_09/ABC_Company/druid/2017/02]
2017-03-14T19:48:21,996 INFO [task-runner-0-priority-0] io.druid.indexer.path.GranularityPathSpec - Checking path[s3n://fs-bigdata-test/batch_jobs/2017_03_14_19_03_09/ABC_Company/druid/2017/03]
2017-03-14T19:48:22,228 INFO [task-runner-0-priority-0] io.druid.indexer.path.GranularityPathSpec - Appending path []
2017-03-14T19:48:22,282 INFO [task-runner-0-priority-0] org.apache.hadoop.conf.Configuration.deprecation - session.id is deprecated. Instead, use dfs.metrics.session-id
2017-03-14T19:48:22,282 INFO [task-runner-0-priority-0] org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
2017-03-14T19:48:22,306 WARN [task-runner-0-priority-0] org.apache.hadoop.mapreduce.JobSubmitter - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2017-03-14T19:48:22,309 WARN [task-runner-0-priority-0] org.apache.hadoop.mapreduce.JobSubmitter - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2017-03-14T19:48:22,314 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.JobSubmitter - Cleaning up the staging area file:/tmp/hadoop-druid/mapred/staging/druid1711207211/.staging/job_local1711207211_0001
2017-03-14T19:48:22,315 WARN [task-runner-0-priority-0] org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:druid (auth:SIMPLE) cause:java.io.IOException: No input paths specified in job
2017-03-14T19:48:22,317 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_ABC_Company_2017-03-14T19:48:13.670Z, type=index_hadoop, dataSource=ABC_Company}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
	at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:204) ~[druid-indexing-service-0.9.2.jar:0.9.2]
	at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:175) ~[druid-indexing-service-0.9.2.jar:0.9.2]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.9.2.jar:0.9.2]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.9.2.jar:0.9.2]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_45]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_45]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_45]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_45]
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_45]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_45]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_45]
	at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_45]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:201) ~[druid-indexing-service-0.9.2.jar:0.9.2]
	... 7 more
Caused by: java.lang.RuntimeException: java.io.IOException: No input paths specified in job
	at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
	at io.druid.indexer.DetermineHashedPartitionsJob.run(DetermineHashedPartitionsJob.java:208) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexer.JobHelper.runJobs(JobHelper.java:349) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexer.HadoopDruidDetermineConfigurationJob.run(HadoopDruidDetermineConfigurationJob.java:91) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexing.common.task.HadoopIndexTask$HadoopDetermineConfigInnerProcessing.runTask(HadoopIndexTask.java:291) ~[druid-indexing-service-0.9.2.jar:0.9.2]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_45]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_45]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_45]
	at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_45]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:201) ~[druid-indexing-service-0.9.2.jar:0.9.2]
	... 7 more
Caused by: java.io.IOException: No input paths specified in job
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:231) ~[?:?]
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340) ~[?:?]
	at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:493) ~[?:?]
	at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:510) ~[?:?]
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394) ~[?:?]
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285) ~[?:?]
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282) ~[?:?]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_45]
	at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_45]
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) ~[?:?]
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282) ~[?:?]
	at io.druid.indexer.DetermineHashedPartitionsJob.run(DetermineHashedPartitionsJob.java:116) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexer.JobHelper.runJobs(JobHelper.java:349) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexer.HadoopDruidDetermineConfigurationJob.run(HadoopDruidDetermineConfigurationJob.java:91) ~[druid-indexing-hadoop-0.9.2.jar:0.9.2]
	at io.druid.indexing.common.task.HadoopIndexTask$HadoopDetermineConfigInnerProcessing.runTask(HadoopIndexTask.java:291) ~[druid-indexing-service-0.9.2.jar:0.9.2]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_45]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_45]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_45]
	at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_45]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:201) ~[druid-indexing-service-0.9.2.jar:0.9.2]
	... 7 more
2017-03-14T19:48:22,329 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_ABC_Company_2017-03-14T19:48:13.670Z] status changed to [FAILED].
2017-03-14T19:48:22,331 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_ABC_Company_2017-03-14T19:48:13.670Z",
  "status" : "FAILED",
  "duration" : 4911
}

Thanks

Hong

Hong Wang

Mar 20, 2017, 5:36:47 PM
to Druid User
This issue was resolved after changing the batch loader's date range.

Thanks
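
To illustrate why the date range matters, here is a rough sketch of the behaviour visible in the log above: one candidate folder is checked per month of the requested interval, and if none of them exist the list of input paths ends up empty, which Hadoop reports as 'No input paths specified in job'. This is not Druid's actual implementation; the function names are hypothetical, the `yyyy/MM` layout is inferred from the logged paths, and the set of existing folders is made up for the example.

```python
from datetime import date


def month_paths(base, start, end):
    """Return one candidate path per calendar month overlapping [start, end)."""
    paths = []
    year, month = start.year, start.month
    while date(year, month, 1) < end:
        paths.append("{}/{:04d}/{:02d}".format(base, year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return paths


def select_input_paths(candidates, existing):
    """Keep only the candidates that exist (stand-in for the S3 check)."""
    return [p for p in candidates if p in existing]


base = "s3n://fs-bigdata-test/batch_jobs/2017_03_14_19_03_09/ABC_Company/druid"
candidates = month_paths(base, date(2017, 1, 1), date(2017, 4, 1))

# Assumption for illustration: none of the month folders exist in the bucket.
existing = set()
inputs = select_input_paths(candidates, existing)
if not inputs:
    print("No input paths: the interval matches no existing folder")
```

Adjusting the interval so that it covers folders that actually exist in the bucket gives a non-empty list of input paths.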