Writing S3 files as output of indexing (No such file or directory)

Udayakumar Pandurangan

Dec 1, 2016, 7:56:05 PM
to Druid User
Hi there,
        I have a small Druid setup in AWS that is up and running. I loaded some files into S3 as input data for the cluster (the bucket is in my personal account, with the right permissions).

I wanted to load this file into the small Druid cluster I have built. The indexing task is able to read the S3 file, and the map phase completes (probably the reduce phase as well), but it seems to fail when writing back to S3 with "No such file or directory".
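
For context, I submit the task spec below to the overlord's task endpoint, roughly like this (exact host illustrative):

curl -X POST -H 'Content-Type: application/json' \
  -d @/home/ec2-user/druid/misc/wikiticker-index-aws-s3.json \
  http://ip-172-31-10-212:8090/druid/indexer/v1/task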

Configs:

Common:

# For S3 deep storage:
druid.storage.type=s3
druid.storage.bucket=druid-segment-data
druid.storage.baseKey=druid/segments
druid.s3.accessKey=...
druid.s3.secretKey=....

# For S3 indexing logs:
druid.indexer.logs.type=s3
druid.indexer.logs.s3Bucket=druid-segment-data
druid.indexer.logs.s3Prefix=druid/indexing-logs

Index task config (JSON):

[ec2-user@ip-172-31-10-212 druid-0.9.1.1]$ head -20 /home/ec2-user/druid/misc/wikiticker-index-aws-s3.json 

{
  "type" : "index_hadoop",
  "spec" : {
    "ioConfig" : {
      "type" : "hadoop",
      "inputSpec" : {
        "type" : "static",
        "paths" : "s3://druid-sample-data/druid_data_load/wikiticker-2015-09-12-sampled.json"
      }
    },
    "dataSchema" : {
      "dataSource" : "wikiticker",
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "day",
        "queryGranularity" : "none",
        "intervals" : ["2015-09-12/2015-09-13"]
      },
      "parser" : {
        "type" : "string",
:
:
    "tuningConfig" : {
      "type" : "hadoop",
      "partitionsSpec" : {
        "type" : "hashed",
        "targetPartitionSize" : 5000000
      },
      "jobProperties" : {
        "fs.s3.awsAccessKeyId" : "....",
        "fs.s3.awsSecretAccessKey" : "....",
        "fs.s3.impl" : "org.apache.hadoop.fs.s3native.NativeS3FileSystem"
      }
    }
  }
}

Here is the log file from the indexing job:

2016-12-02T00:28:09,478 INFO [pool-19-thread-1] org.apache.hadoop.mapred.Task - Task:attempt_local1133588661_0001_r_000000_0 is done. And is in the process of committing
2016-12-02T00:28:09,480 INFO [pool-19-thread-1] org.apache.hadoop.mapred.LocalJobRunner - 1 / 1 copied.
2016-12-02T00:28:09,482 INFO [pool-19-thread-1] org.apache.hadoop.mapred.Task - Task attempt_local1133588661_0001_r_000000_0 is allowed to commit now
2016-12-02T00:28:09,482 INFO [pool-19-thread-1] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt_local1133588661_0001_r_000000_0' to file:/home/ec2-user/druid/druid-0.9.1.1/var/druid/hadoop-tmp/wikiticker/2016-12-02T002750.494Z/f1635bb75df44fa7b08db9407d6d6820/groupedData/_temporary/0/task_local1133588661_0001_r_000000
2016-12-02T00:28:09,490 INFO [pool-19-thread-1] org.apache.hadoop.mapred.LocalJobRunner - reduce > reduce
2016-12-02T00:28:09,490 INFO [pool-19-thread-1] org.apache.hadoop.mapred.Task - Task 'attempt_local1133588661_0001_r_000000_0' done.
2016-12-02T00:28:09,490 INFO [pool-19-thread-1] org.apache.hadoop.mapred.LocalJobRunner - Finishing task: attempt_local1133588661_0001_r_000000_0
2016-12-02T00:28:09,490 INFO [Thread-38] org.apache.hadoop.mapred.LocalJobRunner - reduce task executor complete.
2016-12-02T00:28:10,432 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 100%
2016-12-02T00:28:10,433 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_local1133588661_0001 completed successfully
2016-12-02T00:28:10,452 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Counters: 38
	File System Counters
		FILE: Number of bytes read=2894
		FILE: Number of bytes written=428546
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		S3: Number of bytes read=34212512
		S3: Number of bytes written=0
		S3: Number of read operations=0
		S3: Number of large read operations=0
		S3: Number of write operations=0
	Map-Reduce Framework
		Map input records=39244
		Map output records=1
		Map output bytes=1043
		Map output materialized bytes=1053
		Input split bytes=320
		Combine input records=0
		Combine output records=0
		Reduce input groups=1
		Reduce shuffle bytes=1053
		Reduce input records=1
		Reduce output records=0
		Spilled Records=2
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=306
		CPU time spent (ms)=0
		Physical memory (bytes) snapshot=0
		Virtual memory (bytes) snapshot=0
		Total committed heap usage (bytes)=313638912
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=106
2016-12-02T00:28:10,452 INFO [task-runner-0-priority-0] io.druid.indexer.DetermineHashedPartitionsJob - Job completed, loading up partitions for intervals[Optional.of([2015-09-12T00:00:00.000Z/2015-09-13T00:00:00.000Z])].
2016-12-02T00:28:10,466 INFO [task-runner-0-priority-0] io.druid.indexer.DetermineHashedPartitionsJob - Found approximately [40,337] rows in data.
2016-12-02T00:28:10,466 INFO [task-runner-0-priority-0] io.druid.indexer.DetermineHashedPartitionsJob - Creating [1] shards
2016-12-02T00:28:10,467 INFO [task-runner-0-priority-0] io.druid.indexer.DetermineHashedPartitionsJob - DetermineHashedPartitionsJob took 9979 millis
2016-12-02T00:28:10,467 INFO [task-runner-0-priority-0] io.druid.indexer.JobHelper - Deleting path[var/druid/hadoop-tmp/wikiticker/2016-12-02T002750.494Z/f1635bb75df44fa7b08db9407d6d6820]
2016-12-02T00:28:10,528 INFO [task-runner-0-priority-0] io.druid.indexing.common.actions.RemoteTaskActionClient - Performing action for task[index_hadoop_wikiticker_2016-12-02T00:27:50.494Z]: LockListAction{}
2016-12-02T00:28:10,534 INFO [task-runner-0-priority-0] io.druid.indexing.common.actions.RemoteTaskActionClient - Submitting action for task[index_hadoop_wikiticker_2016-12-02T00:27:50.494Z] to overlord[http://ip-172-31-10-212.us-west-1.compute.internal:8090/druid/indexer/v1/action]: LockListAction{}
2016-12-02T00:28:10,540 INFO [task-runner-0-priority-0] io.druid.indexing.common.task.HadoopIndexTask - Setting version to: 2016-12-02T00:27:50.497Z
2016-12-02T00:28:10,552 INFO [task-runner-0-priority-0] io.druid.indexer.HadoopDruidIndexerConfig - Running with config:
{
:
:
      },
      "combineText" : false,
      "useCombiner" : false,
      "buildV9Directly" : false,
      "numBackgroundPersistThreads" : 0
    },
    "uniqueId" : "f1635bb75df44fa7b08db9407d6d6820"
  }
}
2016-12-02T00:28:16,450 INFO [Thread-59] org.apache.hadoop.mapred.LocalJobRunner - reduce task executor complete.
2016-12-02T00:28:16,452 WARN [Thread-59] org.apache.hadoop.mapred.LocalJobRunner - job_local1928146517_0002
java.lang.Exception: java.io.IOException: No such file or directory
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) ~[hadoop-mapreduce-client-common-2.3.0.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529) [hadoop-mapreduce-client-common-2.3.0.jar:?]
Caused by: java.io.IOException: No such file or directory
	at java.io.UnixFileSystem.createFileExclusively(Native Method) ~[?:1.7.0_121]
	at java.io.File.createTempFile(File.java:2001) ~[?:1.7.0_121]
	at java.io.File.createTempFile(File.java:2047) ~[?:1.7.0_121]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:558) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:469) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389) ~[hadoop-mapreduce-client-core-2.3.0.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319) ~[hadoop-mapreduce-client-common-2.3.0.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[?:1.7.0_121]
	at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[?:1.7.0_121]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[?:1.7.0_121]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[?:1.7.0_121]
	at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_121]
2016-12-02T00:28:16,869 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 0%
2016-12-02T00:28:16,869 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_local1928146517_0002 failed with state FAILED due to: NA
2016-12-02T00:28:16,873 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Counters: 38
	File System Counters
		FILE: Number of bytes read=2902
		FILE: Number of bytes written=17321755
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		S3: Number of bytes read=34212512
		S3: Number of bytes written=0
		S3: Number of read operations=0
		S3: Number of large read operations=0
		S3: Number of write operations=0
:
:
:
2016-12-02T00:28:16,890 INFO [task-runner-0-priority-0] io.druid.indexer.JobHelper - Deleting path[var/druid/hadoop-tmp/wikiticker/2016-12-02T002750.497Z/f1635bb75df44fa7b08db9407d6d6820]
2016-12-02T00:28:16,928 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_wikiticker_2016-12-02T00:27:50.494Z, type=index_hadoop, dataSource=wikiticker}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
	at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:204) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:208) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_121]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_121]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_121]
	at java.lang.Thread.run(Thread.java:745) [?:1.7.0_121]
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_121]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_121]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_121]
	at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_121]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:201) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	... 7 more
Caused by: com.metamx.common.ISE: Job[class io.druid.indexer.IndexGeneratorJob] failed!
	at io.druid.indexer.JobHelper.runJobs(JobHelper.java:343) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:94) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:261) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_121]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_121]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_121]
	at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_121]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:201) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	... 7 more
2016-12-02T00:28:16,948 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_wikiticker_2016-12-02T00:27:50.494Z] status changed to [FAILED].
2016-12-02T00:28:16,950 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_wikiticker_2016-12-02T00:27:50.494Z",
  "status" : "FAILED",
  "duration" : 20999
}





I tried different combinations of file, directory, and bucket permissions. What exactly is it failing on? The job only says it failed due to "NA", so I can't tell what is going wrong. From the logs I understand that it completed the map phase (it read all 39k+ rows) and probably the reduce phase as well, which produces the final merged segment. My guess is that it fails while trying to write the output, but I can't say that for sure.
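
One of the permission checks was simply writing a test object to the bucket with the same keys via the AWS CLI (assuming the CLI is configured with those keys; the local path is illustrative):

echo "permissions check" > /tmp/druid-perm-check.txt
aws s3 cp /tmp/druid-perm-check.txt s3://druid-segment-data/druid/segments/druid-perm-check.txt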

Please help me out here. Any help will be greatly appreciated.


Thanks,
Uday.

Udayakumar Pandurangan

Dec 1, 2016, 7:57:51 PM
to Druid User
Btw - I see the indexing-logs directory getting created under the druid-segment-data bucket, but not the final segments.

Thanks,
Uday.

Jonathan Wei

Dec 1, 2016, 10:00:36 PM
to druid...@googlegroups.com
Hi Uday,

Looks like the indexer cannot create a temp file, most likely because the temp directory it points at doesn't exist:

Caused by: java.io.IOException: No such file or directory
	at java.io.UnixFileSystem.createFileExclusively(Native Method) ~[?:1.7.0_121]
	at java.io.File.createTempFile(File.java:2001) ~[?:1.7.0_121]
	at java.io.File.createTempFile(File.java:2047) ~[?:1.7.0_121]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:558) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]

Can you check the location set for "java.io.tmpdir" and ensure that it's accessible?
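
For reference, the 0.9.x quickstart configs typically set a relative temp path in each service's jvm.config:

-Djava.io.tmpdir=var/tmp

A relative path resolves against the working directory of the process, so tasks fail with exactly this error if that directory doesn't exist there. A minimal fix, assuming that layout (path illustrative):

mkdir -p /home/ec2-user/druid/druid-0.9.1.1/var/tmp

Alternatively, set java.io.tmpdir to an absolute path that the Druid process can write to.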

Thanks,
Jon


Udayakumar Pandurangan

Dec 2, 2016, 12:59:35 AM
to Druid User
Thanks a lot Jonathan Wei!

That was indeed the problem. It's working fine now - really appreciate your help! :)

Thanks,
Uday.