Re: [druid-user] Spatial Indexing problem

Gian Merlino

Aug 26, 2016, 1:54:24 AM
to druid...@googlegroups.com
Hey 马恺熠,

Your spatialDimensions need both "dimName" (the name of the spatial column to create) and "dims" (the names of the input columns in your raw data that contain the coordinates). The docs say "dims" is optional, but I believe that's inaccurate.
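
For example, the spatialDimensions block from your spec below could be rewritten roughly like this (a sketch only: the "pickup_coordinates" and "dropoff_coordinates" names are placeholders I made up, and I'm assuming the [latitude, longitude] ordering shown in the spatial indexing docs):

      "spatialDimensions": [
        {
          "dimName": "pickup_coordinates",
          "dims": ["pickup_latitude", "pickup_longitude"]
        },
        {
          "dimName": "dropoff_coordinates",
          "dims": ["dropoff_latitude", "dropoff_longitude"]
        }
      ]

With "dims" set, each row's coordinate values should get combined into a single "lat,long" entry stored under "dimName", which is what the spatial filters then query against.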

Gian

On Thu, Aug 4, 2016 at 12:10 PM, 马恺熠 <kyle...@gmail.com> wrote:
Hi,
I'm trying to run a batch ingestion to test the spatial dimension feature, but the task fails with the error below. I'm running Druid 0.9.1.1.
Here is the error log:


2016-08-04T18:42:09,305 WARN [Thread-252] org.apache.hadoop.mapred.LocalJobRunner - job_local591941022_0002
java.lang.Exception: java.lang.NullPointerException
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) ~[hadoop-mapreduce-client-common-2.3.0.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529) [hadoop-mapreduce-client-common-2.3.0.jar:?]
Caused by: java.lang.NullPointerException
	at com.google.common.collect.Iterables$3.transform(Iterables.java:508) ~[guava-16.0.1.jar:?]
	at com.google.common.collect.Iterables$3.transform(Iterables.java:505) ~[guava-16.0.1.jar:?]
	at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48) ~[guava-16.0.1.jar:?]
	at com.google.common.collect.Iterators$5.hasNext(Iterators.java:543) ~[guava-16.0.1.jar:?]
	at com.google.common.collect.Iterators.addAll(Iterators.java:356) ~[guava-16.0.1.jar:?]
	at com.google.common.collect.Sets.newHashSet(Sets.java:238) ~[guava-16.0.1.jar:?]
	at com.google.common.collect.Sets.newHashSet(Sets.java:218) ~[guava-16.0.1.jar:?]
	at io.druid.segment.incremental.SpatialDimensionRowTransformer.<init>(SpatialDimensionRowTransformer.java:63) ~[druid-processing-0.9.1.1.jar:0.9.1.1]
	at io.druid.segment.incremental.IncrementalIndex.<init>(IncrementalIndex.java:435) ~[druid-processing-0.9.1.1.jar:0.9.1.1]
	at io.druid.segment.incremental.OnheapIncrementalIndex.<init>(OnheapIncrementalIndex.java:67) ~[druid-processing-0.9.1.1.jar:0.9.1.1]
	at io.druid.segment.incremental.OnheapIncrementalIndex.<init>(OnheapIncrementalIndex.java:124) ~[druid-processing-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexer.IndexGeneratorJob.makeIncrementalIndex(IndexGeneratorJob.java:233) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexer.IndexGeneratorJob.access$000(IndexGeneratorJob.java:93) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:551) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:469) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319) ~[hadoop-mapreduce-client-common-2.3.0.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0]
	at java.lang.Thread.run(Thread.java:744) ~[?:1.8.0]
2016-08-04T18:42:10,196 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_local591941022_0002 running in uber mode : false
2016-08-04T18:42:10,196 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 0%
2016-08-04T18:42:10,196 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_local591941022_0002 failed with state FAILED due to: NA
2016-08-04T18:42:10,197 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Counters: 33
	File System Counters
		FILE: Number of bytes read=39163
		FILE: Number of bytes written=432098
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
	Map-Reduce Framework
		Map input records=10
		Map output records=10
		Map output bytes=4635
		Map output materialized bytes=4681
		Input split bytes=269
		Combine input records=0
		Combine output records=0
		Reduce input groups=0
		Reduce shuffle bytes=4681
		Reduce input records=0
		Reduce output records=0
		Spilled Records=10
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=0
		CPU time spent (ms)=0
		Physical memory (bytes) snapshot=0
		Virtual memory (bytes) snapshot=0
		Total committed heap usage (bytes)=241696768
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=0
2016-08-04T18:42:10,215 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_yellow_tripdata_JSON_2016-08-04T18:42:01.610Z, type=index_hadoop, dataSource=yellow_tripdata_JSON}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
	at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:204) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:208) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0]
	at java.lang.Thread.run(Thread.java:744) [?:1.8.0]
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0]
	at java.lang.reflect.Method.invoke(Method.java:483) ~[?:1.8.0]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:201) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	... 7 more
Caused by: com.metamx.common.ISE: Job[class io.druid.indexer.IndexGeneratorJob] failed!
	at io.druid.indexer.JobHelper.runJobs(JobHelper.java:343) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:94) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:261) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0]
	at java.lang.reflect.Method.invoke(Method.java:483) ~[?:1.8.0]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:201) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	... 7 more
2016-08-04T18:42:10,221 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_yellow_tripdata_JSON_2016-08-04T18:42:01.610Z] status changed to [FAILED].
2016-08-04T18:42:10,223 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_yellow_tripdata_JSON_2016-08-04T18:42:01.610Z",
  "status" : "FAILED",
  "duration" : 5857
}

--------------------------------------------------------------------------------------------------------------
My spec file:
    "dataSchema" : {
      "dataSource" : "yellow_tripdata_JSON",
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "day",
        "queryGranularity" : "none",
        "intervals" : ["2015-08-01/2015-09-01"]
      },
      "parser" : {
        "type" : "string",
        "parseSpec" : {
          "format" : "json",
          "dimensionsSpec" : {
              "dimensions" : [
              "VendorID",
              "tpep_dropoff_datetime",
              "RatecodeID",
              "store_and_fwd_flag",
              "payment_type"
            ],
              "spatialDimensions": [
                {
                  "dimName": "pickup_longitude"
                },
                {
                  "dimName": "pickup_latitude"
                },
                {
                  "dimName": "dropoff_longitude"
                },
                {
                  "dimName": "dropoff_latitude"
                }
              ]
          },
          "timestampSpec" : {
            "format" : "auto",
            "column" : "tpep_pickup_datetime"
          }
        }
      },
      "metricsSpec" : [
        {
          "name" : "passenger_count",
          "type" : "longSum",
          "fieldName" : "passenger_count"
        },
        {
          "name" : "trip_distance",
          "type" : "doubleSum",
          "fieldName" : "trip_distance"
        },
        {
          "name" : "fare_amount",
          "type" : "doubleSum",
          "fieldName" : "fare_amount"
        },
        {
          "name" : "extra",
          "type" : "doubleSum",
          "fieldName" : "extra"
        },
        {
          "name" : "mta_tax",
          "type" : "doubleSum",
          "fieldName" : "mta_tax"
        },
        {
          "name" : "tip_amount",
          "type" : "doubleSum",
          "fieldName" : "tip_amount"
        },
        {
          "name" : "tolls_amount",
          "type" : "doubleSum",
          "fieldName" : "tolls_amount"
        },
        {
          "name" : "improvement_surcharge",
          "type" : "doubleSum",
          "fieldName" : "improvement_surcharge"
        },
        {
          "name" : "total_amount",
          "type" : "doubleSum",
          "fieldName" : "total_amount"
        }
      ]
    },

And a sample data row:
{"VendorID":2,"tpep_pickup_datetime":"2015-08-01T00:00:15Z","tpep_dropoff_datetime":"2015-08-01T00:36:21Z","passenger_count":1,"trip_distance":7.22,"pickup_longitude":-73.99980926513672,"pickup_latitude":40.74333953857422,"RatecodeID":1,"store_and_fwd_flag":"N","dropoff_longitude":-73.9428482055664,"dropoff_latitude":40.80662155151367,"payment_type":2,"fare_amount":29.5,"extra":0.5,"mta_tax":0.5,"tip_amount":0,"tolls_amount":0,"improvement_surcharge":0.3,"total_amount":30.8}

