Getting 'Failed to create directory within 10000 attempts' when configured with S3


Prerna Sharma

Aug 29, 2016, 3:02:07 AM
to Druid User
Hi,

I am unable to ingest data after configuring S3 as deep storage. I have a cluster of 4 nodes (broker, historical, middle manager, and overlord+coordinator).

Below is the exception:

2016-08-29T06:47:50,701 INFO [task-runner-0-priority-0] io.druid.segment.realtime.firehose.LocalFirehoseFactory - Found files: [/home/druid/druid-0.9.1.1/hist_data.csv]
2016-08-29T06:47:50,767 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[IndexTask{id=index_temperature_stream_2016-08-29T06:47:41.115Z, type=index, dataSource=temperature_stream}]
java.lang.IllegalStateException: Failed to create directory within 10000 attempts (tried 1472453270713-0 to 1472453270713-9999)
	at com.google.common.io.Files.createTempDir(Files.java:600) ~[guava-16.0.1.jar:?]
	at io.druid.segment.indexing.RealtimeTuningConfig.createNewBasePersistDirectory(RealtimeTuningConfig.java:56) ~[druid-server-0.9.1.1.jar:0.9.1.1]
	at io.druid.segment.indexing.RealtimeTuningConfig.<init>(RealtimeTuningConfig.java:118) ~[druid-server-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexing.common.task.IndexTask.convertTuningConfig(IndexTask.java:146) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexing.common.task.IndexTask.generateSegment(IndexTask.java:376) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:221) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_91]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
2016-08-29T06:47:50,775 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_temperature_stream_2016-08-29T06:47:41.115Z] status changed to [FAILED].
2016-08-29T06:47:50,779 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_temperature_stream_2016-08-29T06:47:41.115Z",
  "status" : "FAILED",
  "duration" : 4134
}

I have checked that java.io.tmpdir exists in S3. Could you tell me which directory this process is trying to create? Or is there something I may have missed?




Prerna Sharma

Aug 29, 2016, 3:09:54 AM
to Druid User
Please find the attached files.

1. Batch Ingestion Task submitted.
2. Common Runtime Properties
3. Middle Manager Runtime Properties
batch_ingest.json
common.runtime.properties
runtime.properties

Nishant Bangarwa

Aug 29, 2016, 3:34:30 AM
to Druid User
Hi Prerna,
It's trying to create a subdirectory under java.io.tmpdir on the middle manager node.
Try setting -Djava.io.tmpdir in your JVM arguments to a directory with read/write permission.
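
Editor's note: java.io.tmpdir is a path on the node's local filesystem rather than a location in S3, so the check has to happen on the machine that runs the task. A minimal sketch of such a check follows. The class TmpDirCheck and its diagnose method are hypothetical names invented for this illustration and are not part of Druid; it uses only java.io.File.

```java
import java.io.File;

// Hypothetical helper (not part of Druid): reports why a directory
// may be unusable as java.io.tmpdir.
final class TmpDirCheck {

    // Returns "ok" or a short description of the first problem found.
    static String diagnose(File dir) {
        if (!dir.exists()) {
            return "missing";
        }
        if (!dir.isDirectory()) {
            return "not a directory";
        }
        // Creating entries inside a directory needs write and execute
        // permission on it; read permission is needed to list it.
        if (!dir.canRead() || !dir.canWrite() || !dir.canExecute()) {
            return "insufficient permissions";
        }
        return "ok";
    }

    public static void main(String[] args) {
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        System.out.println(tmp.getAbsolutePath() + ": " + diagnose(tmp));
    }
}
```

Running it with the same -Djava.io.tmpdir value and as the same OS user as the failing process should show whether the directory is missing or lacks permissions.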

--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+...@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/d84dc2b6-59ca-4639-8509-60d7821dc09b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Prerna Sharma

Aug 29, 2016, 11:16:41 PM
to Druid User
Thanks a lot Nishant. 

Could you also help me with this exception? This happens when fetching historical data segments from S3. Which directory is it complaining about?

2016-08-30T03:10:40,880 INFO [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Completed request [LOAD: temperature_stream_2016-06-04T00:00:00.000Z_2016-06-05T00:00:00.000Z_2016-08-29T07:43:24.379Z]
2016-08-30T03:10:40,880 ERROR [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Failed to load segment for dataSource: {class=io.druid.server.coordination.ZkCoordinator, exceptionType=class io.druid.segment.loading.SegmentLoadingException, exceptionMessage=Exception loading segment[temperature_stream_2016-06-04T00:00:00.000Z_2016-06-05T00:00:00.000Z_2016-08-29T07:43:24.379Z], segment=DataSegment{size=100384, shardSpec=HashBasedNumberedShardSpec{partitionNum=0, partitions=1, partitionDimensions=[]}, metrics=[count_events, temperature, min_temp, max_temp], dimensions=[timestamp, sensorId, sensorName, sensorLat, sensorLong], version='2016-08-29T07:43:24.379Z', loadSpec={type=s3_zip, bucket=servian-mel-druid-storage, key=druid/segments/temperature_stream/2016-06-04T00:00:00.000Z_2016-06-05T00:00:00.000Z/2016-08-29T07:43:24.379Z/0/index.zip}, interval=2016-06-04T00:00:00.000Z/2016-06-05T00:00:00.000Z, dataSource='temperature_stream', binaryVersion='9'}}
io.druid.segment.loading.SegmentLoadingException: Exception loading segment[temperature_stream_2016-06-04T00:00:00.000Z_2016-06-05T00:00:00.000Z_2016-08-29T07:43:24.379Z]
        at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:309) ~[druid-server-0.9.1.1.jar:0.9.1.1]
        at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:350) [druid-server-0.9.1.1.jar:0.9.1.1]
        at io.druid.server.coordination.SegmentChangeRequestLoad.go(SegmentChangeRequestLoad.java:44) [druid-server-0.9.1.1.jar:0.9.1.1]
        at io.druid.server.coordination.ZkCoordinator$1.childEvent(ZkCoordinator.java:152) [druid-server-0.9.1.1.jar:0.9.1.1]
        at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:522) [curator-recipes-2.10.0.jar:?]
        at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:516) [curator-recipes-2.10.0.jar:?]
        at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:93) [curator-framework-2.10.0.jar:?]
        at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) [guava-16.0.1.jar:?]
        at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:85) [curator-framework-2.10.0.jar:?]
        at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:514) [curator-recipes-2.10.0.jar:?]
        at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35) [curator-recipes-2.10.0.jar:?]
        at org.apache.curator.framework.recipes.cache.PathChildrenCache$9.run(PathChildrenCache.java:772) [curator-recipes-2.10.0.jar:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_91]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_91]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_91]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_91]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
Caused by: io.druid.segment.loading.SegmentLoadingException: No such file or directory
        at io.druid.storage.s3.S3DataSegmentPuller.getSegmentFiles(S3DataSegmentPuller.java:238) ~[?:?]
        at io.druid.storage.s3.S3LoadSpec.loadSegment(S3LoadSpec.java:62) ~[?:?]
        at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegmentFiles(SegmentLoaderLocalCacheManager.java:143) ~[druid-server-0.9.1.1.jar:0.9.1.1]
        at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:95) ~[druid-server-0.9.1.1.jar:0.9.1.1]
        at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:152) ~[druid-server-0.9.1.1.jar:0.9.1.1]
        at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:305) ~[druid-server-0.9.1.1.jar:0.9.1.1]
        ... 18 more
Caused by: java.io.IOException: No such file or directory
        at java.io.UnixFileSystem.createFileExclusively(Native Method) ~[?:1.8.0_91]
        at java.io.File.createTempFile(File.java:2024) ~[?:1.8.0_91]
        at java.io.File.createTempFile(File.java:2070) ~[?:1.8.0_91]
        at com.metamx.common.CompressionUtils.unzip(CompressionUtils.java:149) ~[java-util-0.27.9.jar:?]
        at io.druid.storage.s3.S3DataSegmentPuller.getSegmentFiles(S3DataSegmentPuller.java:207) ~[?:?]
        at io.druid.storage.s3.S3LoadSpec.loadSegment(S3LoadSpec.java:62) ~[?:?]
        at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegmentFiles(SegmentLoaderLocalCacheManager.java:143) ~[druid-server-0.9.1.1.jar:0.9.1.1]
        at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:95) ~[druid-server-0.9.1.1.jar:0.9.1.1]
        at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:152) ~[druid-server-0.9.1.1.jar:0.9.1.1]
        at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:305) ~[druid-server-0.9.1.1.jar:0.9.1.1]
        ... 18 more

Fangjin Yang

Aug 31, 2016, 8:17:55 PM
to Druid User
Hi Prerna, in your historical configuration there should be a segmentCache directory where segments get downloaded locally. Does that directory exist? Do you have proper permissions for it?

Prerna Sharma

Sep 2, 2016, 12:28:00 AM
to Druid User
Hi Fangjin,

I was able to resolve the issue. It was not the segment cache; it was again complaining about the temp directory not existing.
Thanks a lot :)
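
Editor's note: the innermost frames of the trace above (File.createTempFile called from CompressionUtils.unzip) are consistent with this diagnosis, since the two-argument File.createTempFile writes into java.io.tmpdir. The sketch below reproduces that kind of failure outside Druid by passing a non-existent directory explicitly. TempFileProbe and tryCreateTempFile are hypothetical names used only for this illustration.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical probe (not part of Druid): tries to create a temp file
// in the given directory, as File.createTempFile does for java.io.tmpdir.
final class TempFileProbe {

    // Returns null on success, or the IOException message on failure.
    static String tryCreateTempFile(File dir) {
        try {
            File f = File.createTempFile("probe", ".tmp", dir);
            f.delete(); // clean up the probe file
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        String error = tryCreateTempFile(dir);
        System.out.println(dir + ": " + (error == null ? "temp file created" : error));
    }
}
```

On Linux the message for a missing directory is typically "No such file or directory", matching the trace, though the exact text is platform-dependent.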





Des Sindatry

Nov 15, 2016, 4:23:42 PM
to Druid User
Hi Prerna,

How did you fix it? I am seeing exactly the same issue in my application.

Regards,
Des

Nishant Bangarwa

Nov 16, 2016, 6:27:50 AM
to Druid User
Hi Des,
IIRC, the above issue was resolved by setting -Djava.io.tmpdir in the JVM arguments to a directory with read/write permission.



vishnu rao

Feb 10, 2017, 5:54:27 AM
to Druid User
Hi,

I looked at the code, and the code that throws the exception is:

-------------------
// Files here is Guava's com.google.common.io.Files (see the stack trace).
private static File createNewBasePersistDirectory()
{
  return Files.createTempDir();
}
-------------------

I set java.io.tmpdir = /tmp/druid/mm (which did not exist, and gave permissions 777).

Now, shouldn't the code create it? I was wondering why the Google library fails to create the temp dir.

I resolved it by creating the directory manually, which should not be necessary; the code should do it.

Sounds like a bug.

java.lang.IllegalStateException: Failed to create directory within 10000 attempts (tried 1486723753337-0 to 1486723753337-9999)
	at com.google.common.io.Files.createTempDir(Files.java:600) ~[guava-16.0.1.jar:?]


with regards,
ch Vishnu
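
Editor's note: as far as I can tell from the Guava 16 source, Files.createTempDir() reads java.io.tmpdir and then calls File.mkdir() on up to 10000 candidate names of the form <currentTimeMillis>-<counter>, which matches the "tried ...-0 to ...-9999" text in the exception. File.mkdir() creates only the last path element and never the parents, so every attempt fails when the base directory itself is missing. The sketch below imitates that loop without depending on Guava. TempDirSketch and both method names are hypothetical, and createTempDirEnsuringBase is only one possible workaround, not something Druid or Guava does.

```java
import java.io.File;

// Hypothetical re-creation of the retry loop; not Guava's or Druid's code.
final class TempDirSketch {
    private static final int ATTEMPTS = 10000;

    // Tries <millis>-0 .. <millis>-9999 under baseDir using mkdir(),
    // which does not create missing parent directories.
    static File createTempDirLikeGuava(File baseDir) {
        String baseName = System.currentTimeMillis() + "-";
        for (int counter = 0; counter < ATTEMPTS; counter++) {
            File candidate = new File(baseDir, baseName + counter);
            if (candidate.mkdir()) {
                return candidate;
            }
        }
        throw new IllegalStateException("Failed to create directory within " + ATTEMPTS
                + " attempts (tried " + baseName + "0 to " + baseName + (ATTEMPTS - 1) + ")");
    }

    // Possible workaround: create the base directory (and parents) first.
    static File createTempDirEnsuringBase(File baseDir) {
        if (!baseDir.isDirectory() && !baseDir.mkdirs()) {
            throw new IllegalStateException("Could not create " + baseDir);
        }
        return createTempDirLikeGuava(baseDir);
    }

    public static void main(String[] args) {
        File base = new File(System.getProperty("java.io.tmpdir"));
        System.out.println("Created " + createTempDirEnsuringBase(base));
    }
}
```

Under this reading the behaviour is a documented limitation of mkdir() rather than a permissions problem, which would explain why creating /tmp/druid/mm by hand resolved it.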

Kiran Jagtap

Nov 2, 2018, 10:07:59 AM
to Druid User
Hi Nishant,
I'm using Druid 0.12.3 in cluster mode on a three-node cluster, and I'm getting the same error:

"java.lang.IllegalStateException: Failed to create directory within 10000 attempts (tried 1541165112739-0 to 1541165112739-9999)"

I have set "-Djava.io.tmpdir=tmp" and given it 777 permissions, but the error is still present and I have had no luck getting data ingestion working.

Any help would be greatly appreciated.

Thanks,
Kiran
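
Editor's note: the thread does not record a resolution for this report, so the following is only a guess at something worth checking. The value "tmp" is a relative path, and java.io.File resolves relative paths against the working directory (the user.dir system property) of whichever JVM receives the flag. If that process starts in a different directory than expected, "tmp" may point at a location that does not exist. The class RelativeTmpDir below is a hypothetical illustration of that resolution.

```java
import java.io.File;

// Hypothetical illustration: shows where a relative java.io.tmpdir
// value ends up once the JVM resolves it.
final class RelativeTmpDir {

    // Relative values are resolved against the user.dir system property.
    static File resolve(String tmpdirValue) {
        return new File(tmpdirValue).getAbsoluteFile();
    }

    public static void main(String[] args) {
        String value = args.length > 0 ? args[0] : "tmp";
        File resolved = resolve(value);
        System.out.println(value + " -> " + resolved
                + (resolved.isDirectory() ? " (exists)" : " (missing)"));
    }
}
```

Using an absolute path for -Djava.io.tmpdir avoids the ambiguity. As I understand it, Druid index tasks run in separate peon JVMs launched by the middle manager, so it may also be worth confirming that the flag is present in the JVM options passed to those processes (the druid.indexer.runner.javaOpts property) and not only on the middle manager itself.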