I am exploring Druid using the imply.io distribution.
I was able to post the wikiticker data that ships with
imply.io and see the data in Pivot as expected.
After that, I created another spec file, placed it in the quickstart folder, and ran the indexing job to load the data.
I am getting the following error message:
Bytes Written=8
2016-07-13T08:34:11,107 ERROR [task-runner-0-priority-0] io.druid.indexer.IndexGeneratorJob - [File var/druid/hadoop-tmp/sales/2016-07-13T083357.479Z/ce6b738c2f6749098763ac533b80abed/segmentDescriptorInfo does not exist] SegmentDescriptorInfo is not found usually when indexing process did not produce any segments meaning either there was no input data to process or all the input events were discarded due to some error
2016-07-13T08:34:11,110 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_sales_2016-07-13T08:33:53.220Z, type=index_hadoop, dataSource=sales}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:204) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:208) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.9.1.1.jar:0.9.1.1]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.9.1.1.jar:0.9.1.1]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_92]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_92]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_92]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_92]
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_92]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_92]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_92]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_92]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:201) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
... 7 more
Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File var/druid/hadoop-tmp/sales/2016-07-13T083357.479Z/ce6b738c2f6749098763ac533b80abed/segmentDescriptorInfo does not exist
The task was submitted with:

bin/post-index-task --file quickstart/sales-index.json

The ingestion spec for the file is:
{
  "type" : "index_hadoop",
  "spec" : {
    "ioConfig" : {
      "type" : "hadoop",
      "inputSpec" : {
        "type" : "static",
        "paths" : "quickstart/sales-2016-06-27-sampled.json"
      }
    },
    "dataSchema" : {
      "dataSource" : "sales",
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "week",
        "queryGranularity" : "none",
        "intervals" : ["2016-06-27/2016-06-28"]
      },
      "parser" : {
        "type" : "string",
        "parseSpec" : {
          "format" : "json",
          "timestampSpec" : {
            "column" : "Week",
            "format" : "YYYYMMDD"
          },
          "dimensionsSpec" : {
            "dimensions" : ["Week", "SKU", "Str", "Type", "Qty"],
            "dimensionExclusions" : [],
            "spatialDimensions" : []
          }
        }
      },
      "metricsSpec" : [
        {
          "type" : "count",
          "name" : "count"
        },
        {
          "type" : "doubleSum",
          "name" : "QtySum",
          "fieldName" : "Qty"
        }
      ],
      "tuningConfig" : {
        "type" : "hadoop",
        "partitionsSpec" : {
          "type" : "hashed",
          "targetPartitionSize" : 5000000
        },
        "jobProperties" : {}
      }
    }
  }
}
A sample of the sales-2016-06-27-sampled.json data set follows:
{"Week": 20131014,"SKU": "0001780","Str": "0011100","Type": "abc","Qty": 22}
{"Week": 20131223,"SKU": "0001780","Str": "0001100","Type": "abc","Qty": 2}
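The error text says segments are only missing when there was no input or all events were discarded, so one quick sanity check is whether the sample rows actually fall inside the spec's configured interval. Below is a small standalone sketch of that check in Python; it assumes the spec's "YYYYMMDD" timestamp pattern is meant to behave like Python's %Y%m%d (Druid itself uses Joda-Time patterns, so the real parsing may differ):

```python
from datetime import datetime

# Sample rows copied from sales-2016-06-27-sampled.json
rows = [
    {"Week": 20131014, "SKU": "0001780", "Str": "0011100", "Type": "abc", "Qty": 22},
    {"Week": 20131223, "SKU": "0001780", "Str": "0001100", "Type": "abc", "Qty": 2},
]

# Interval from the spec's granularitySpec: 2016-06-27/2016-06-28
start = datetime(2016, 6, 27)
end = datetime(2016, 6, 28)

# Parse the "Week" column with %Y%m%d (assumed equivalent of "YYYYMMDD")
in_interval = [
    r for r in rows
    if start <= datetime.strptime(str(r["Week"]), "%Y%m%d") < end
]
print(len(in_interval))  # 0 -> every sample row falls outside the interval
```

Both sample timestamps are from 2013, while the interval only covers one day in 2016, which would cause every event to be discarded during indexing.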