Trouble setting up Druid via Quickstart documentation

Carsten Steckel

Oct 13, 2016, 11:27:47 AM
to Druid User
Hi,

I have set up Druid 0.9.1.1 as described in http://druid.io/docs/latest/tutorials/quickstart.html on Ubuntu 16.04 with Oracle JDK 1.8, in the directory /var/lib/druid (owned by user:group = druid:druid). All ZooKeeper and Druid services are running with the ./conf-quickstart settings under the druid user, and bin/init has been executed.
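
For reference, these are roughly the steps I followed (as the druid user, inside /var/lib/druid; commands from memory, so they may differ slightly from the quickstart page):

# fetch and unpack the release
> curl -O http://static.druid.io/artifacts/releases/druid-0.9.1.1-bin.tar.gz
> tar -xzf druid-0.9.1.1-bin.tar.gz

# create the log and var/* working directories (including var/tmp)
> bin/init

# start ZooKeeper, then each Druid service in its own terminal, e.g. the overlord:
> java `cat conf-quickstart/druid/overlord/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/overlord:lib/*" io.druid.cli.Main server overlord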

When I start loading the wikiticker data with

> curl -X 'POST' -H 'Content-Type:application/json' -d @quickstart/wikiticker-index.json localhost:8090/druid/indexer/v1/task

{"task":"index_hadoop_wikiticker_2016-10-13T15:18:01.519Z"}

I can watch the progress on localhost:8090, but the job fails; see the log at the end of this mail. It seems the reducer cannot execute createTempFile successfully. I can see that other services have successfully found either /tmp or /var/lib/druid/var/druid/hadoop-tmp.
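
To inspect the failure I polled the overlord API, something like this (using the task id returned by the POST above):

# overall task status
> curl localhost:8090/druid/indexer/v1/task/index_hadoop_wikiticker_2016-10-13T15:18:01.519Z/status

# full task log, which is where the excerpt at the end of this mail comes from
> curl localhost:8090/druid/indexer/v1/task/index_hadoop_wikiticker_2016-10-13T15:18:01.519Z/log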

Any ideas or help? 

Cheers
Carsten

[...]
2016-10-13T15:18:15,559 INFO [pool-22-thread-1] io.druid.indexer.HadoopDruidIndexerConfig - Running with config:
{
  "spec" : {
    "dataSchema" : {
      "dataSource" : "wikiticker",
      "parser" : {
        "type" : "string",
        "parseSpec" : {
          "format" : "json",
          "dimensionsSpec" : {
            "dimensions" : [ "channel", "cityName", "comment", "countryIsoCode", "countryName", "isAnonymous", "isMinor", "isNew", "isRobot", "isUnpatrolled", "metroCode", "namespace", "page", "regionIsoCode", "regionName", "user" ]
          },
          "timestampSpec" : {
            "format" : "auto",
            "column" : "time"
          }
        }
      },
      "metricsSpec" : [ {
        "type" : "count",
        "name" : "count"
      }, {
        "type" : "longSum",
        "name" : "added",
        "fieldName" : "added"
      }, {
        "type" : "longSum",
        "name" : "deleted",
        "fieldName" : "deleted"
      }, {
        "type" : "longSum",
        "name" : "delta",
        "fieldName" : "delta"
      }, {
        "type" : "hyperUnique",
        "name" : "user_unique",
        "fieldName" : "user"
      } ],
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "DAY",
        "queryGranularity" : {
          "type" : "none"
        },
        "intervals" : [ "2015-09-12T00:00:00.000Z/2015-09-13T00:00:00.000Z" ]
      }
    },
    "ioConfig" : {
      "type" : "hadoop",
      "inputSpec" : {
        "type" : "static",
        "paths" : "quickstart/wikiticker-2015-09-12-sampled.json"
      },
      "metadataUpdateSpec" : null,
      "segmentOutputPath" : "file:/var/lib/druid/var/druid/segments/wikiticker"
    },
    "tuningConfig" : {
      "type" : "hadoop",
      "workingPath" : "var/druid/hadoop-tmp",
      "version" : "2016-10-13T15:18:01.586Z",
      "partitionsSpec" : {
        "type" : "hashed",
        "targetPartitionSize" : 5000000,
        "maxPartitionSize" : 7500000,
        "assumeGrouped" : false,
        "numShards" : -1,
        "partitionDimensions" : [ ]
      },
      "shardSpecs" : {
        "2015-09-12T00:00:00.000Z" : [ {
          "actualSpec" : {
            "type" : "none"
          },
          "shardNum" : 0
        } ]
      },
      "indexSpec" : {
        "bitmap" : {
          "type" : "concise"
        },
        "dimensionCompression" : null,
        "metricCompression" : null
      },
      "maxRowsInMemory" : 75000,
      "leaveIntermediate" : false,
      "cleanupOnFailure" : true,
      "overwriteFiles" : false,
      "ignoreInvalidRows" : false,
      "jobProperties" : { },
      "combineText" : false,
      "useCombiner" : false,
      "buildV9Directly" : false,
      "numBackgroundPersistThreads" : 0
    },
    "uniqueId" : "a6be0dee233a43229ff8d09674f06ec8"
  }
}
2016-10-13T15:18:15,570 INFO [Thread-42] org.apache.hadoop.mapred.LocalJobRunner - reduce task executor complete.
2016-10-13T15:18:15,576 WARN [Thread-42] org.apache.hadoop.mapred.LocalJobRunner - job_local1788840695_0002
java.lang.Exception: java.io.IOException: No such file or directory
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) ~[hadoop-mapreduce-client-common-2.3.0.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529) [hadoop-mapreduce-client-common-2.3.0.jar:?]
Caused by: java.io.IOException: No such file or directory
	at java.io.UnixFileSystem.createFileExclusively(Native Method) ~[?:1.8.0_101]
	at java.io.File.createTempFile(File.java:2024) ~[?:1.8.0_101]
	at java.io.File.createTempFile(File.java:2070) ~[?:1.8.0_101]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:558) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:469) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319) ~[hadoop-mapreduce-client-common-2.3.0.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_101]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_101]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_101]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_101]
	at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_101]
2016-10-13T15:18:15,677 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 0%
2016-10-13T15:18:15,677 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_local1788840695_0002 failed with state FAILED due to: NA
2016-10-13T15:18:15,681 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Counters: 33
	File System Counters
		FILE: Number of bytes read=34215398
		FILE: Number of bytes written=17311661
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
	Map-Reduce Framework
		Map input records=39244
		Map output records=39244
		Map output bytes=16736001
		Map output materialized bytes=16892983
		Input split bytes=295
		Combine input records=0
		Combine output records=0
		Reduce input groups=0
		Reduce shuffle bytes=16892983
		Reduce input records=0
		Reduce output records=0
		Spilled Records=39244
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=135
		CPU time spent (ms)=0
		Physical memory (bytes) snapshot=0
		Virtual memory (bytes) snapshot=0
		Total committed heap usage (bytes)=811073536
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=0

Carsten Steckel

Oct 13, 2016, 11:37:12 AM
to Druid User
Darn,

First I spent about two hours looking around and found nothing, until I read an interesting post in this forum telling me that jvm.config sets java.io.tmpdir. So I checked whether var/tmp exists, and for whatever reason it did not. bin/init had not run properly and I did not notice, LoL.
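
In case anyone else hits this, the check and fix boiled down to something like this (run from the Druid install directory):

# the quickstart jvm.config files set java.io.tmpdir to the relative path var/tmp
> grep tmpdir conf-quickstart/druid/*/jvm.config

# in my case this directory was missing, which is why createTempFile blew up
> ls -ld var/tmp

# re-running bin/init recreates var/tmp and the other working directories
> bin/init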

Problem solved.

Cheers
Carsten

Fangjin Yang

Oct 31, 2016, 8:45:21 PM
to Druid User
For future reference, the Druid quickstart is based on https://imply.io/docs/latest/quickstart, so it may be easier to try that instead.