I tried loading several different files. All the Hadoop jobs finished successfully, but I got an error in the console.
Here is the full log from running HadoopDruidIndexer:
2013-08-27 11:47:33,463 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking start method[public void com.metamx.druid.indexer.HadoopDruidIndexerNode.start() throws java.lang.Exception] on object[com.metamx.druid.indexer.HadoopDruidIndexerNode@7a3f437c].
2013-08-27 11:47:34,431 INFO [main] com.metamx.druid.indexer.HadoopDruidIndexerConfig - Running with config:
{
"dataSource" : "datainsight",
"timestampColumn" : "date_time",
"timestampFormat" : "yyyy-MM-dd HH:mm:ss",
"dataSpec" : {
"format" : "json",
"dimensions" : [ "daily_table", "hour", "exchange", "publisher_id", "size", "advertiser_id", "campaign_id", "line_item_id", "pricing_type", "creative_id", "browser", "os", "geo_country", "geo_region", "country_region", "city" ],
"spatialDimensions" : [ ]
},
"granularitySpec" : {
"type" : "uniform",
"gran" : "HOUR",
"intervals" : [ "2013-07-01T00:00:00.000Z/2013-07-02T00:00:00.000Z" ]
"rollupGranularity" : {
"type" : "duration",
"duration" : 60000,
"origin" : "1970-01-01T00:00:00.000Z"
},
"rowFlushBoundary" : 500000
},
"updaterJobSpec" : {
"type" : "db",
"connectURI" : "jdbc:mysql://my-server:3306/druid",
"user" : "druid",
"password" : "diurd",
"segmentTable" : "segments",
"useValidationQuery" : false,
"validationQuery" : "SELECT 1"
},
"ignoreInvalidRows" : false,
"registererers" : null
}
2013-08-27 11:47:34,855 WARN [main] org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2013-08-27 11:47:34,869 INFO [main] com.metamx.druid.indexer.path.StaticPathSpec - Adding paths[hdfs://my-server:8020/user/user1/logs/druid/log_imp/2013-07-01]
2013-08-27 11:47:35,697 INFO [main] com.metamx.druid.indexer.path.StaticPathSpec - Adding paths[hdfs://my-server:8020/user/user1/logs/druid/log_imp/2013-07-01]
2013-08-27 11:47:35,927 WARN [main] org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2013-08-27 11:47:37,115 INFO [main] org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 3
2013-08-27 11:47:37,518 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Job datainsight-determine_partitions_groupby-[2013-07-01T00:00:00.000Z/2013-07-02T00:00:00.000Z] submitted, status available at:
http://my-server:50030/jobdetails.jsp?jobid=job_201307151148_1459
2013-08-27 11:47:37,519 INFO [main] org.apache.hadoop.mapred.JobClient - Running job: job_201307151148_1459
2013-08-27 11:47:38,523 INFO [main] org.apache.hadoop.mapred.JobClient - map 0% reduce 0%
2013-08-27 11:47:48,561 INFO [main] org.apache.hadoop.mapred.JobClient - map 33% reduce 0%
2013-08-27 11:47:51,572 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 0%
2013-08-27 11:47:54,583 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 100%
2013-08-27 11:47:56,593 INFO [main] org.apache.hadoop.mapred.JobClient - Job complete: job_201307151148_1459
2013-08-27 11:47:56,606 INFO [main] org.apache.hadoop.mapred.JobClient - Counters: 32
2013-08-27 11:47:56,608 INFO [main] org.apache.hadoop.mapred.JobClient - File System Counters
2013-08-27 11:47:56,610 INFO [main] org.apache.hadoop.mapred.JobClient - FILE: Number of bytes read=6
2013-08-27 11:47:56,610 INFO [main] org.apache.hadoop.mapred.JobClient - FILE: Number of bytes written=662007
2013-08-27 11:47:56,611 INFO [main] org.apache.hadoop.mapred.JobClient - FILE: Number of read operations=0
2013-08-27 11:47:56,611 INFO [main] org.apache.hadoop.mapred.JobClient - FILE: Number of large read operations=0
2013-08-27 11:47:56,611 INFO [main] org.apache.hadoop.mapred.JobClient - FILE: Number of write operations=0
2013-08-27 11:47:56,611 INFO [main] org.apache.hadoop.mapred.JobClient - HDFS: Number of bytes read=1664064
2013-08-27 11:47:56,611 INFO [main] org.apache.hadoop.mapred.JobClient - HDFS: Number of bytes written=95
2013-08-27 11:47:56,611 INFO [main] org.apache.hadoop.mapred.JobClient - HDFS: Number of read operations=6
2013-08-27 11:47:56,611 INFO [main] org.apache.hadoop.mapred.JobClient - HDFS: Number of large read operations=0
2013-08-27 11:47:56,611 INFO [main] org.apache.hadoop.mapred.JobClient - HDFS: Number of write operations=1
2013-08-27 11:47:56,613 INFO [main] org.apache.hadoop.mapred.JobClient - Job Counters
2013-08-27 11:47:56,613 INFO [main] org.apache.hadoop.mapred.JobClient - Launched map tasks=3
2013-08-27 11:47:56,614 INFO [main] org.apache.hadoop.mapred.JobClient - Launched reduce tasks=1
2013-08-27 11:47:56,614 INFO [main] org.apache.hadoop.mapred.JobClient - Data-local map tasks=3
2013-08-27 11:47:56,614 INFO [main] org.apache.hadoop.mapred.JobClient - Total time spent by all maps in occupied slots (ms)=25685
2013-08-27 11:47:56,614 INFO [main] org.apache.hadoop.mapred.JobClient - Total time spent by all reduces in occupied slots (ms)=5673
2013-08-27 11:47:56,614 INFO [main] org.apache.hadoop.mapred.JobClient - Total time spent by all maps waiting after reserving slots (ms)=0
2013-08-27 11:47:56,614 INFO [main] org.apache.hadoop.mapred.JobClient - Total time spent by all reduces waiting after reserving slots (ms)=0
2013-08-27 11:47:56,616 INFO [main] org.apache.hadoop.mapred.JobClient - Map-Reduce Framework
2013-08-27 11:47:56,616 INFO [main] org.apache.hadoop.mapred.JobClient - Map input records=4202
2013-08-27 11:47:56,616 INFO [main] org.apache.hadoop.mapred.JobClient - Map output records=0
2013-08-27 11:47:56,616 INFO [main] org.apache.hadoop.mapred.JobClient - Map output bytes=0
2013-08-27 11:47:56,616 INFO [main] org.apache.hadoop.mapred.JobClient - Input split bytes=594
2013-08-27 11:47:56,616 INFO [main] org.apache.hadoop.mapred.JobClient - Combine input records=0
2013-08-27 11:47:56,616 INFO [main] org.apache.hadoop.mapred.JobClient - Combine output records=0
2013-08-27 11:47:56,616 INFO [main] org.apache.hadoop.mapred.JobClient - Reduce input groups=0
2013-08-27 11:47:56,617 INFO [main] org.apache.hadoop.mapred.JobClient - Reduce shuffle bytes=18
2013-08-27 11:47:56,617 INFO [main] org.apache.hadoop.mapred.JobClient - Reduce input records=0
2013-08-27 11:47:56,617 INFO [main] org.apache.hadoop.mapred.JobClient - Reduce output records=0
2013-08-27 11:47:56,617 INFO [main] org.apache.hadoop.mapred.JobClient - Spilled Records=0
2013-08-27 11:47:56,617 INFO [main] org.apache.hadoop.mapred.JobClient - CPU time spent (ms)=9810
2013-08-27 11:47:56,617 INFO [main] org.apache.hadoop.mapred.JobClient - Physical memory (bytes) snapshot=826146816
2013-08-27 11:47:56,617 INFO [main] org.apache.hadoop.mapred.JobClient - Virtual memory (bytes) snapshot=3951435776
2013-08-27 11:47:56,617 INFO [main] org.apache.hadoop.mapred.JobClient - Total committed heap usage (bytes)=672727040
2013-08-27 11:47:56,661 WARN [main] org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2013-08-27 11:47:57,465 INFO [main] org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2013-08-27 11:47:57,679 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Job datainsight-determine_partitions_dimselection-[2013-07-01T00:00:00.000Z/2013-07-02T00:00:00.000Z] submitted, status available at:
http://my-server:50030/jobdetails.jsp?jobid=job_201307151148_1460
2013-08-27 11:47:57,679 INFO [main] org.apache.hadoop.mapred.JobClient - Running job: job_201307151148_1460
2013-08-27 11:47:58,681 INFO [main] org.apache.hadoop.mapred.JobClient - map 0% reduce 0%
2013-08-27 11:48:07,713 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 0%
2013-08-27 11:48:13,737 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 8%
2013-08-27 11:48:17,751 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 13%
2013-08-27 11:48:18,755 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 17%
2013-08-27 11:48:21,765 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 21%
2013-08-27 11:48:23,771 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 25%
2013-08-27 11:48:25,778 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 29%
2013-08-27 11:48:28,788 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 33%
2013-08-27 11:48:29,791 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 37%
2013-08-27 11:48:32,800 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 42%
2013-08-27 11:48:34,807 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 46%
2013-08-27 11:48:37,816 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 50%
2013-08-27 11:48:38,819 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 54%
2013-08-27 11:48:41,830 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 58%
2013-08-27 11:48:42,833 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 63%
2013-08-27 11:48:46,849 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 67%
2013-08-27 11:48:47,853 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 71%
2013-08-27 11:48:51,874 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 79%
2013-08-27 11:48:55,887 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 88%
2013-08-27 11:49:00,903 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 96%
2013-08-27 11:49:04,920 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 100%
2013-08-27 11:49:06,930 INFO [main] org.apache.hadoop.mapred.JobClient - Job complete: job_201307151148_1460
2013-08-27 11:49:06,933 INFO [main] org.apache.hadoop.mapred.JobClient - Counters: 32
2013-08-27 11:49:06,933 INFO [main] org.apache.hadoop.mapred.JobClient - File System Counters
2013-08-27 11:49:06,933 INFO [main] org.apache.hadoop.mapred.JobClient - FILE: Number of bytes read=144
2013-08-27 11:49:06,933 INFO [main] org.apache.hadoop.mapred.JobClient - FILE: Number of bytes written=4167174
2013-08-27 11:49:06,933 INFO [main] org.apache.hadoop.mapred.JobClient - FILE: Number of read operations=0
2013-08-27 11:49:06,933 INFO [main] org.apache.hadoop.mapred.JobClient - FILE: Number of large read operations=0
2013-08-27 11:49:06,933 INFO [main] org.apache.hadoop.mapred.JobClient - FILE: Number of write operations=0
2013-08-27 11:49:06,933 INFO [main] org.apache.hadoop.mapred.JobClient - HDFS: Number of bytes read=249
2013-08-27 11:49:06,934 INFO [main] org.apache.hadoop.mapred.JobClient - HDFS: Number of bytes written=0
2013-08-27 11:49:06,934 INFO [main] org.apache.hadoop.mapred.JobClient - HDFS: Number of read operations=3
2013-08-27 11:49:06,934 INFO [main] org.apache.hadoop.mapred.JobClient - HDFS: Number of large read operations=0
2013-08-27 11:49:06,934 INFO [main] org.apache.hadoop.mapred.JobClient - HDFS: Number of write operations=0
2013-08-27 11:49:06,934 INFO [main] org.apache.hadoop.mapred.JobClient - Job Counters
2013-08-27 11:49:06,934 INFO [main] org.apache.hadoop.mapred.JobClient - Launched map tasks=1
2013-08-27 11:49:06,934 INFO [main] org.apache.hadoop.mapred.JobClient - Launched reduce tasks=25
2013-08-27 11:49:06,934 INFO [main] org.apache.hadoop.mapred.JobClient - Data-local map tasks=1
2013-08-27 11:49:06,934 INFO [main] org.apache.hadoop.mapred.JobClient - Total time spent by all maps in occupied slots (ms)=9414
2013-08-27 11:49:06,934 INFO [main] org.apache.hadoop.mapred.JobClient - Total time spent by all reduces in occupied slots (ms)=106428
2013-08-27 11:49:06,934 INFO [main] org.apache.hadoop.mapred.JobClient - Total time spent by all maps waiting after reserving slots (ms)=0
2013-08-27 11:49:06,935 INFO [main] org.apache.hadoop.mapred.JobClient - Total time spent by all reduces waiting after reserving slots (ms)=0
2013-08-27 11:49:06,935 INFO [main] org.apache.hadoop.mapred.JobClient - Map-Reduce Framework
2013-08-27 11:49:06,935 INFO [main] org.apache.hadoop.mapred.JobClient - Map input records=0
2013-08-27 11:49:06,935 INFO [main] org.apache.hadoop.mapred.JobClient - Map output records=0
2013-08-27 11:49:06,935 INFO [main] org.apache.hadoop.mapred.JobClient - Map output bytes=0
2013-08-27 11:49:06,935 INFO [main] org.apache.hadoop.mapred.JobClient - Input split bytes=154
2013-08-27 11:49:06,935 INFO [main] org.apache.hadoop.mapred.JobClient - Combine input records=0
2013-08-27 11:49:06,935 INFO [main] org.apache.hadoop.mapred.JobClient - Combine output records=0
2013-08-27 11:49:06,935 INFO [main] org.apache.hadoop.mapred.JobClient - Reduce input groups=0
2013-08-27 11:49:06,935 INFO [main] org.apache.hadoop.mapred.JobClient - Reduce shuffle bytes=144
2013-08-27 11:49:06,935 INFO [main] org.apache.hadoop.mapred.JobClient - Reduce input records=0
2013-08-27 11:49:06,935 INFO [main] org.apache.hadoop.mapred.JobClient - Reduce output records=0
2013-08-27 11:49:06,936 INFO [main] org.apache.hadoop.mapred.JobClient - Spilled Records=0
2013-08-27 11:49:06,936 INFO [main] org.apache.hadoop.mapred.JobClient - CPU time spent (ms)=66260
2013-08-27 11:49:06,936 INFO [main] org.apache.hadoop.mapred.JobClient - Physical memory (bytes) snapshot=3491840000
2013-08-27 11:49:06,936 INFO [main] org.apache.hadoop.mapred.JobClient - Virtual memory (bytes) snapshot=24944799744
2013-08-27 11:49:06,936 INFO [main] org.apache.hadoop.mapred.JobClient - Total committed heap usage (bytes)=3010920448
2013-08-27 11:49:06,940 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Job completed, loading up partitions for intervals[[2013-07-01T00:00:00.000Z/2013-07-01T01:00:00.000Z, 2013-07-01T01:00:00.000Z/2013-07-01T02:00:00.000Z, 2013-07-01T02:00:00.000Z/2013-07-01T03:00:00.000Z, 2013-07-01T03:00:00.000Z/2013-07-01T04:00:00.000Z, 2013-07-01T04:00:00.000Z/2013-07-01T05:00:00.000Z, 2013-07-01T05:00:00.000Z/2013-07-01T06:00:00.000Z, 2013-07-01T06:00:00.000Z/2013-07-01T07:00:00.000Z, 2013-07-01T07:00:00.000Z/2013-07-01T08:00:00.000Z, 2013-07-01T08:00:00.000Z/2013-07-01T09:00:00.000Z, 2013-07-01T09:00:00.000Z/2013-07-01T10:00:00.000Z, 2013-07-01T10:00:00.000Z/2013-07-01T11:00:00.000Z, 2013-07-01T11:00:00.000Z/2013-07-01T12:00:00.000Z, 2013-07-01T12:00:00.000Z/2013-07-01T13:00:00.000Z, 2013-07-01T13:00:00.000Z/2013-07-01T14:00:00.000Z, 2013-07-01T14:00:00.000Z/2013-07-01T15:00:00.000Z, 2013-07-01T15:00:00.000Z/2013-07-01T16:00:00.000Z, 2013-07-01T16:00:00.000Z/2013-07-01T17:00:00.000Z, 2013-07-01T17:00:00.000Z/2013-07-01T18:00:00.000Z, 2013-07-01T18:00:00.000Z/2013-07-01T19:00:00.000Z, 2013-07-01T19:00:00.000Z/2013-07-01T20:00:00.000Z, 2013-07-01T20:00:00.000Z/2013-07-01T21:00:00.000Z, 2013-07-01T21:00:00.000Z/2013-07-01T22:00:00.000Z, 2013-07-01T22:00:00.000Z/2013-07-01T23:00:00.000Z, 2013-07-01T23:00:00.000Z/2013-07-02T00:00:00.000Z]].
2013-08-27 11:49:06,946 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T000000.000Z_20130701T010000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,948 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T010000.000Z_20130701T020000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,951 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T020000.000Z_20130701T030000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,956 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T030000.000Z_20130701T040000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,959 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T040000.000Z_20130701T050000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,963 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T050000.000Z_20130701T060000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,965 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T060000.000Z_20130701T070000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,967 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T070000.000Z_20130701T080000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,970 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T080000.000Z_20130701T090000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,972 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T090000.000Z_20130701T100000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,974 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T100000.000Z_20130701T110000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,976 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T110000.000Z_20130701T120000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,979 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T120000.000Z_20130701T130000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,981 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T130000.000Z_20130701T140000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,983 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T140000.000Z_20130701T150000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,985 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T150000.000Z_20130701T160000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,990 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T160000.000Z_20130701T170000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,992 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T170000.000Z_20130701T180000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,993 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T180000.000Z_20130701T190000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,995 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T190000.000Z_20130701T200000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,997 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T200000.000Z_20130701T210000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:06,999 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T210000.000Z_20130701T220000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:07,001 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T220000.000Z_20130701T230000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:07,003 INFO [main] com.metamx.druid.indexer.DeterminePartitionsJob - Path[hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/20130701T230000.000Z_20130702T000000.000Z/partitions.json] didn't exist!?
2013-08-27 11:49:07,029 INFO [main] com.metamx.druid.indexer.path.StaticPathSpec - Adding paths[hdfs://my-server:8020/user/user1/logs/druid/log_imp/2013-07-01]
2013-08-27 11:49:07,039 WARN [main] org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2013-08-27 11:49:07,936 INFO [main] org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 3
2013-08-27 11:49:08,262 INFO [main] com.metamx.druid.indexer.IndexGeneratorJob - Job datainsight-index-generator-[2013-07-01T00:00:00.000Z/2013-07-02T00:00:00.000Z] submitted, status available at
http://my-server:50030/jobdetails.jsp?jobid=job_201307151148_1461
2013-08-27 11:49:08,262 INFO [main] org.apache.hadoop.mapred.JobClient - Running job: job_201307151148_1461
2013-08-27 11:49:09,264 INFO [main] org.apache.hadoop.mapred.JobClient - map 0% reduce 0%
2013-08-27 11:49:21,304 INFO [main] org.apache.hadoop.mapred.JobClient - map 33% reduce 0%
2013-08-27 11:49:25,317 INFO [main] org.apache.hadoop.mapred.JobClient - map 100% reduce 0%
2013-08-27 11:49:27,326 INFO [main] org.apache.hadoop.mapred.JobClient - Job complete: job_201307151148_1461
2013-08-27 11:49:27,329 INFO [main] org.apache.hadoop.mapred.JobClient - Counters: 24
2013-08-27 11:49:27,329 INFO [main] org.apache.hadoop.mapred.JobClient - File System Counters
2013-08-27 11:49:27,329 INFO [main] org.apache.hadoop.mapred.JobClient - FILE: Number of bytes read=0
2013-08-27 11:49:27,329 INFO [main] org.apache.hadoop.mapred.JobClient - FILE: Number of bytes written=496559
2013-08-27 11:49:27,329 INFO [main] org.apache.hadoop.mapred.JobClient - FILE: Number of read operations=0
2013-08-27 11:49:27,330 INFO [main] org.apache.hadoop.mapred.JobClient - FILE: Number of large read operations=0
2013-08-27 11:49:27,330 INFO [main] org.apache.hadoop.mapred.JobClient - FILE: Number of write operations=0
2013-08-27 11:49:27,330 INFO [main] org.apache.hadoop.mapred.JobClient - HDFS: Number of bytes read=1664064
2013-08-27 11:49:27,330 INFO [main] org.apache.hadoop.mapred.JobClient - HDFS: Number of bytes written=0
2013-08-27 11:49:27,330 INFO [main] org.apache.hadoop.mapred.JobClient - HDFS: Number of read operations=6
2013-08-27 11:49:27,331 INFO [main] org.apache.hadoop.mapred.JobClient - HDFS: Number of large read operations=0
2013-08-27 11:49:27,331 INFO [main] org.apache.hadoop.mapred.JobClient - HDFS: Number of write operations=3
2013-08-27 11:49:27,331 INFO [main] org.apache.hadoop.mapred.JobClient - Job Counters
2013-08-27 11:49:27,331 INFO [main] org.apache.hadoop.mapred.JobClient - Launched map tasks=3
2013-08-27 11:49:27,331 INFO [main] org.apache.hadoop.mapred.JobClient - Data-local map tasks=3
2013-08-27 11:49:27,331 INFO [main] org.apache.hadoop.mapred.JobClient - Total time spent by all maps in occupied slots (ms)=28038
2013-08-27 11:49:27,332 INFO [main] org.apache.hadoop.mapred.JobClient - Total time spent by all reduces in occupied slots (ms)=0
2013-08-27 11:49:27,332 INFO [main] org.apache.hadoop.mapred.JobClient - Total time spent by all maps waiting after reserving slots (ms)=0
2013-08-27 11:49:27,332 INFO [main] org.apache.hadoop.mapred.JobClient - Total time spent by all reduces waiting after reserving slots (ms)=0
2013-08-27 11:49:27,332 INFO [main] org.apache.hadoop.mapred.JobClient - Map-Reduce Framework
2013-08-27 11:49:27,332 INFO [main] org.apache.hadoop.mapred.JobClient - Map input records=4202
2013-08-27 11:49:27,332 INFO [main] org.apache.hadoop.mapred.JobClient - Map output records=0
2013-08-27 11:49:27,333 INFO [main] org.apache.hadoop.mapred.JobClient - Input split bytes=594
2013-08-27 11:49:27,333 INFO [main] org.apache.hadoop.mapred.JobClient - Spilled Records=0
2013-08-27 11:49:27,333 INFO [main] org.apache.hadoop.mapred.JobClient - CPU time spent (ms)=7080
2013-08-27 11:49:27,333 INFO [main] org.apache.hadoop.mapred.JobClient - Physical memory (bytes) snapshot=382013440
2013-08-27 11:49:27,333 INFO [main] org.apache.hadoop.mapred.JobClient - Virtual memory (bytes) snapshot=2971459584
2013-08-27 11:49:27,333 INFO [main] org.apache.hadoop.mapred.JobClient - Total committed heap usage (bytes)=351141888
2013-08-27 11:49:27,388 INFO [main] com.metamx.druid.indexer.HadoopDruidIndexerMain - Throwable caught at startup, committing seppuku
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler.start(Lifecycle.java:331)
at com.metamx.common.lifecycle.Lifecycle.start(Lifecycle.java:250)
at com.metamx.druid.indexer.HadoopDruidIndexerMain.main(HadoopDruidIndexerMain.java:50)
Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/segmentDescriptorInfo does not exist.
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at com.metamx.druid.indexer.IndexGeneratorJob.getPublishedSegments(IndexGeneratorJob.java:182)
at com.metamx.druid.indexer.DbUpdaterJob.run(DbUpdaterJob.java:61)
at com.metamx.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:99)
at com.metamx.druid.indexer.HadoopDruidIndexerNode.start(HadoopDruidIndexerNode.java:125)
... 7 more
Caused by: java.io.FileNotFoundException: File hdfs://my-server:8020/tmp/datainsight/2013-08-27T114734.314Z/segmentDescriptorInfo does not exist.
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:409)
at com.metamx.druid.indexer.IndexGeneratorJob.getPublishedSegments(IndexGeneratorJob.java:175)
... 10 more
In HDFS I have empty part-m-* and groupedData/part-r-* files for this job, but the files I loaded are not empty. They contain JSON documents separated by newlines.
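
The counters above show Map input records=4202 but Map output records=0, so one thing that can be checked locally is whether each input line parses as JSON and whether its timestamp matches the configured format and falls inside the configured interval. Below is a minimal sketch of such a check in Python. It is not Druid's own parsing logic: the function names are made up for illustration, the strptime pattern is my translation of "yyyy-MM-dd HH:mm:ss", and treating the timestamps as UTC is an assumption.

```python
import json
from datetime import datetime, timezone

# Values copied from the config printed in the log above.
TIMESTAMP_COLUMN = "date_time"
# Python strptime equivalent of "yyyy-MM-dd HH:mm:ss" (my translation).
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
INTERVAL_START = datetime(2013, 7, 1, tzinfo=timezone.utc)
INTERVAL_END = datetime(2013, 7, 2, tzinfo=timezone.utc)


def classify_line(line):
    """Hypothetical helper: say why a row would or would not fit the config.

    Returns one of: 'ok', 'bad_json', 'no_timestamp', 'bad_timestamp',
    'out_of_interval'.
    """
    try:
        row = json.loads(line)
    except ValueError:
        return "bad_json"
    if not isinstance(row, dict) or TIMESTAMP_COLUMN not in row:
        return "no_timestamp"
    try:
        ts = datetime.strptime(str(row[TIMESTAMP_COLUMN]), TIMESTAMP_FORMAT)
    except ValueError:
        return "bad_timestamp"
    # Assumption: timestamps in the data are UTC.
    ts = ts.replace(tzinfo=timezone.utc)
    if INTERVAL_START <= ts < INTERVAL_END:
        return "ok"
    return "out_of_interval"


def summarize(lines):
    """Count how many non-blank lines fall into each category."""
    counts = {}
    for line in lines:
        if not line.strip():
            continue
        result = classify_line(line)
        counts[result] = counts.get(result, 0) + 1
    return counts
```

Running `summarize(open(path))` on a local copy of one of the input files would show whether most rows land in a category other than 'ok', which would be consistent with the empty map output.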