io.druid.segment.loading.SegmentLoadingException: Exception loading segment


Geoff Berger

Nov 25, 2015, 4:28:42 PM
to Druid User
I'm trying to spin up a historical node, but I repeatedly get a SegmentLoadingException. For example:

2015-11-25T20:40:57,339 ERROR [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Failed to load segment for dataSource: {class=io.druid.server.coordination.ZkCoordinator, exceptionType=class io.druid.segment.loading.SegmentLoadingException, exceptionMessage=Exception loading segment[messages_2015-11-17T15:00:00.000Z_2015-11-17T16:00:00.000Z_2015-11-17T15:00:00.000Z], segment=DataSegment{size=14994, shardSpec=NoneShardSpec, metrics=[count], dimensions=[dimension1, dimension2, dimension3], version='2015-11-17T15:00:00.000Z', loadSpec={type=local, path=/tmp/druid/localStorage/messages/2015-11-17T15:00:00.000Z_2015-11-17T16:00:00.000Z/2015-11-17T15:00:00.000Z/0/index.zip}, interval=2015-11-17T15:00:00.000Z/2015-11-17T16:00:00.000Z, dataSource='messages', binaryVersion='9'}}
io.druid.segment.loading.SegmentLoadingException: Exception loading segment[messages_2015-11-17T15:00:00.000Z_2015-11-17T16:00:00.000Z_2015-11-17T15:00:00.000Z]
at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:146) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:171) [druid-server-0.8.1.jar:0.8.1]
at io.druid.server.coordination.SegmentChangeRequestLoad.go(SegmentChangeRequestLoad.java:42) [druid-server-0.8.1.jar:0.8.1]
at io.druid.server.coordination.BaseZkCoordinator$1.childEvent(BaseZkCoordinator.java:115) [druid-server-0.8.1.jar:0.8.1]
at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:516) [curator-recipes-2.8.0.jar:?]
at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:510) [curator-recipes-2.8.0.jar:?]
at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:92) [curator-framework-2.8.0.jar:?]
at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) [guava-16.0.1.jar:?]
at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:84) [curator-framework-2.8.0.jar:?]
at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:508) [curator-recipes-2.8.0.jar:?]
at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35) [curator-recipes-2.8.0.jar:?]
at org.apache.curator.framework.recipes.cache.PathChildrenCache$9.run(PathChildrenCache.java:759) [curator-recipes-2.8.0.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_31]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_31]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_31]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_31]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_31]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_31]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_31]
Caused by: java.lang.IllegalArgumentException: Instantiation of [simple type, class io.druid.segment.loading.LocalLoadSpec] value failed: [/tmp/druid/localStorage/messages/2015-11-17T15:00:00.000Z_2015-11-17T16:00:00.000Z/2015-11-17T15:00:00.000Z/0/index.zip] does not exist
at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:2774) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.ObjectMapper.convertValue(ObjectMapper.java:2700) ~[jackson-databind-2.4.4.jar:2.4.4]
at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegmentFiles(SegmentLoaderLocalCacheManager.java:140) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:93) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:151) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:142) ~[druid-server-0.8.1.jar:0.8.1]
... 18 more
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Instantiation of [simple type, class io.druid.segment.loading.LocalLoadSpec] value failed: [/tmp/druid/localStorage/messages/2015-11-17T15:00:00.000Z_2015-11-17T16:00:00.000Z/2015-11-17T15:00:00.000Z/0/index.zip] does not exist
at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.wrapException(StdValueInstantiator.java:405) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:234) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:167) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:398) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1064) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:264) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:156) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:126) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:113) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:84) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:132) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:41) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:2769) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.ObjectMapper.convertValue(ObjectMapper.java:2700) ~[jackson-databind-2.4.4.jar:2.4.4]
at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegmentFiles(SegmentLoaderLocalCacheManager.java:140) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:93) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:151) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:142) ~[druid-server-0.8.1.jar:0.8.1]
... 18 more
Caused by: java.lang.IllegalArgumentException: [/tmp/druid/localStorage/messages/2015-11-17T15:00:00.000Z_2015-11-17T16:00:00.000Z/2015-11-17T15:00:00.000Z/0/index.zip] does not exist
at com.google.api.client.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:119) ~[google-http-client-1.15.0-rc.jar:?]
at com.google.api.client.util.Preconditions.checkArgument(Preconditions.java:69) ~[google-http-client-1.15.0-rc.jar:?]
at io.druid.segment.loading.LocalLoadSpec.<init>(LocalLoadSpec.java:49) ~[druid-server-0.8.1.jar:0.8.1]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_31]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_31]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_31]
at java.lang.reflect.Constructor.newInstance(Constructor.java:408) ~[?:1.8.0_31]
at com.fasterxml.jackson.databind.introspect.AnnotatedConstructor.call(AnnotatedConstructor.java:125) ~[jackson-databind-2.4.4.jar:0.8.1]
at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:230) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:167) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:398) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1064) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:264) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:156) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:126) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:113) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:84) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:132) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:41) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:2769) ~[jackson-databind-2.4.4.jar:2.4.4]
at com.fasterxml.jackson.databind.ObjectMapper.convertValue(ObjectMapper.java:2700) ~[jackson-databind-2.4.4.jar:2.4.4]
at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegmentFiles(SegmentLoaderLocalCacheManager.java:140) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:93) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:151) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:142) ~[druid-server-0.8.1.jar:0.8.1]
... 18 more

shortly followed by:

2015-11-25T20:40:58,087 ERROR [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Failed to load segment for dataSource: {class=io.druid.server.coordination.ZkCoordinator, exceptionType=class io.druid.segment.loading.SegmentLoadingException, exceptionMessage=Exception loading segment[messages_2015-11-17T09:00:00.000Z_2015-11-17T10:00:00.000Z_2015-11-17T09:00:00.000Z], segment=DataSegment{size=12508, shardSpec=NoneShardSpec, metrics=[count], dimensions=[dimension1, dimension2, dimension3], version='2015-11-17T09:00:00.000Z', loadSpec={type=local, path=/tmp/druid/localStorage/messages/2015-11-17T09:00:00.000Z_2015-11-17T10:00:00.000Z/2015-11-17T09:00:00.000Z/0/index.zip}, interval=2015-11-17T09:00:00.000Z/2015-11-17T10:00:00.000Z, dataSource='messages', binaryVersion='9'}}
io.druid.segment.loading.SegmentLoadingException: Exception loading segment[messages_2015-11-17T09:00:00.000Z_2015-11-17T10:00:00.000Z_2015-11-17T09:00:00.000Z]
at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:146) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:171) [druid-server-0.8.1.jar:0.8.1]
at io.druid.server.coordination.SegmentChangeRequestLoad.go(SegmentChangeRequestLoad.java:42) [druid-server-0.8.1.jar:0.8.1]
at io.druid.server.coordination.BaseZkCoordinator$1.childEvent(BaseZkCoordinator.java:115) [druid-server-0.8.1.jar:0.8.1]
at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:516) [curator-recipes-2.8.0.jar:?]
at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:510) [curator-recipes-2.8.0.jar:?]
at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:92) [curator-framework-2.8.0.jar:?]
at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) [guava-16.0.1.jar:?]
at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:84) [curator-framework-2.8.0.jar:?]
at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:508) [curator-recipes-2.8.0.jar:?]
at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35) [curator-recipes-2.8.0.jar:?]
at org.apache.curator.framework.recipes.cache.PathChildrenCache$9.run(PathChildrenCache.java:759) [curator-recipes-2.8.0.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_31]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_31]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_31]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_31]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_31]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_31]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_31]
Caused by: io.druid.segment.loading.SegmentLoadingException: /path/to/druid/data/zk_druid/messages/2015-11-17T09:00:00.000Z_2015-11-17T10:00:00.000Z/2015-11-17T09:00:00.000Z/0/index.drd (No such file or directory)
at io.druid.segment.loading.MMappedQueryableIndexFactory.factorize(MMappedQueryableIndexFactory.java:40) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:94) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:151) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:142) ~[druid-server-0.8.1.jar:0.8.1]
... 18 more
Caused by: java.io.FileNotFoundException: /path/to/druid/data/zk_druid/messages/2015-11-17T09:00:00.000Z_2015-11-17T10:00:00.000Z/2015-11-17T09:00:00.000Z/0/index.drd (No such file or directory)
at java.io.FileInputStream.open(Native Method) ~[?:1.8.0_31]
at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[?:1.8.0_31]
at io.druid.segment.SegmentUtils.getVersionFromDir(SegmentUtils.java:24) ~[druid-api-0.3.9.jar:0.8.1]
at io.druid.segment.IndexIO.loadIndex(IndexIO.java:165) ~[druid-processing-0.8.1.jar:0.8.1]
at io.druid.segment.loading.MMappedQueryableIndexFactory.factorize(MMappedQueryableIndexFactory.java:37) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:94) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:151) ~[druid-server-0.8.1.jar:0.8.1]
at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:142) ~[druid-server-0.8.1.jar:0.8.1]
... 18 more

These errors repeat continuously, with different time intervals for each segment. Here is my historical/runtime.properties:

druid.host=<host>
druid.port=<port>
druid.service=<service>

druid.historical.cache.useCache=true
druid.historical.cache.populateCache=true

# Our intermediate buffer is also very small so longer topNs will be slow.
# In prod: set sizeBytes = 512mb
druid.processing.buffer.sizeBytes=1073741824
# We can only scan 1 segment in parallel with these configs.
# In prod: set numThreads = # cores - 1
druid.processing.numThreads=8

# maxSize should reflect the performance you want.
# Druid memory maps segments.
# memory_for_segments = total_memory - heap_size - (processing.buffer.sizeBytes * (processing.numThreads+1)) - JVM overhead (~1G)
# The greater the memory/disk ratio, the better performance you should see
druid.segmentCache.locations=[{"path": "/path/to/druid/data/zk_druid", "maxSize"\: 300000000000}]
druid.monitoring.monitors=["io.druid.server.metrics.HistoricalMetricsMonitor", "com.metamx.metrics.JvmMonitor"]

druid.server.http.numThreads=16
druid.server.maxSize=300000000000
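To make the memory_for_segments formula in the comments above concrete, here is a small sketch. The 64 GiB total memory and 8 GiB heap are hypothetical placeholders; the buffer size and thread count come from the properties above:

```java
// Worked example of the memory_for_segments formula from the config comments.
public class SegmentMemoryBudget {
    static final long GB = 1024L * 1024 * 1024;

    static long memoryForSegments(long totalMemory, long heapSize,
                                  long bufferSizeBytes, int numThreads,
                                  long jvmOverhead) {
        // memory_for_segments = total_memory - heap_size
        //   - (processing.buffer.sizeBytes * (processing.numThreads + 1))
        //   - JVM overhead (~1G)
        return totalMemory - heapSize
                - bufferSizeBytes * (numThreads + 1) - jvmOverhead;
    }

    public static void main(String[] args) {
        // 64 GiB machine and 8 GiB heap are hypothetical; the buffer size
        // (1073741824) and numThreads (8) match the properties above.
        long left = memoryForSegments(64 * GB, 8 * GB, 1073741824L, 8, 1 * GB);
        System.out.println(left / GB + " GiB left for memory-mapped segments");
        // prints "46 GiB left for memory-mapped segments"
    }
}
```

With these settings the processing buffers alone reserve 9 GiB (1 GiB x 9), so the buffer size and thread count directly trade off against the page cache available for segments.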

With the first exception, there appears to be no /tmp/druid directory at all, even though I don't believe I have any configuration pointing to that path. With the second exception, where it's looking for index.drd, the directory contains only an empty file called downloadStartMarker.

Is there any sort of configuration I may be missing that is preventing the zip and index.drd files from appearing in their proper locations? Something may have gone wrong when importing this data, which was done by sending events to Kafka for Druid to ingest. However, I'm able to query Druid with no issues, and I see segments appearing in the database used for metadata storage. Furthermore, I'm using S3 for deep storage, but the logs don't indicate anything going on with S3.

Any help is greatly appreciated. Let me know if other information is needed here. Thank you!

- Geoff

Fangjin Yang

Nov 28, 2015, 5:47:03 PM
to Druid User
You have the local filesystem set up as your deep storage and probably restarted the machine, removing the segment from /tmp.

Geoff Berger

Nov 30, 2015, 5:00:31 PM
to Druid User
Thanks, Fangjin, for responding. There's probably something I'm missing, but I don't believe I have the local filesystem set as my deep storage.

Here is my common.runtime.properties:

# Extensions (no deep storage model is listed - using local fs for deep storage - not recommended for production)
# Also, for production to use mysql add, "io.druid.extensions:mysql-metadata-storage"
druid.extensions.coordinates=["io.druid.extensions:druid-examples","io.druid.extensions:druid-kafka-eight","io.druid.extensions:druid-s3-extensions"]
druid.extensions.localRepository=extensions-repo

druid.request.logging.type=emitter
druid.request.logging.feed=druid_requests

# Zookeeper
druid.zk.service.host=<zkconnection>
druid.zk.paths.base=<zkbase>

druid.discovery.curator.path=/druid/discovery

# Metadata Storage (use something like mysql in production by uncommenting properties below)
# by default druid will use derby
druid.extensions.coordinates=[\"io.druid.extensions:<dbextension>"]
druid.metadata.storage.type=<dbtype>
druid.metadata.storage.connector.connectURI=<dbconnectionstring>
druid.metadata.storage.connector.user=<user>
druid.metadata.storage.connector.password=<password>

# Deep storage (local filesystem for examples - don't use this in production)
#druid.storage.type=local
#druid.storage.storageDirectory=/tmp/druid/localStorage
druid.storage.type=s3
druid.storage.bucket=<bucketname>
druid.s3.accessKey=<accesskey>
druid.s3.secretKey=<secretkey>

# Query Cache (we use a simple 10mb heap-based local cache on the broker)
druid.cache.type=local
druid.cache.sizeInBytes=10000000

# Indexing service discovery
druid.selectors.indexing.serviceName=overlord

# Monitoring (disabled for examples, if you enable SysMonitor, make sure to include sigar jar in your cp)
# druid.monitoring.monitors=["com.metamx.metrics.SysMonitor","com.metamx.metrics.JvmMonitor"]

# Metrics logging (disabled for examples - change this to logging or http in production)
#druid.emitter=noop
druid.emitter=logging
#druid.emitter.http.recipientBaseUrl=

Here is my historical runtime.properties:

druid.host=<host>
druid.port=<port>
druid.service=druid/corp/historical

druid.historical.cache.useCache=true
druid.historical.cache.populateCache=true

# Our intermediate buffer is also very small so longer topNs will be slow.
# In prod: set sizeBytes = 512mb
druid.processing.buffer.sizeBytes=1073741824
# We can only scan 1 segment in parallel with these configs.
# In prod: set numThreads = # cores - 1
druid.processing.numThreads=8

# maxSize should reflect the performance you want.
# Druid memory maps segments.
# memory_for_segments = total_memory - heap_size - (processing.buffer.sizeBytes * (processing.numThreads+1)) - JVM overhead (~1G)
# The greater the memory/disk ratio, the better performance you should see
druid.segmentCache.locations=[{"path": "/path/to/druid/data/zk_druid", "maxSize"\: 300000000000}]
druid.monitoring.monitors=["io.druid.server.metrics.HistoricalMetricsMonitor", "com.metamx.metrics.JvmMonitor"]

druid.server.http.numThreads=16
druid.server.maxSize=300000000000

These errors were happening immediately after data started coming in from Kafka, so I don't know if that's related to the files missing from /tmp.

In the meantime, I'll keep poking around and see if I have anything misconfigured. Thanks for your help.

- Geoff

David Lim

Dec 2, 2015, 9:07:46 PM
to Druid User
Hey Geoff,

I see your problem - you have druid.extensions.coordinates defined twice in your property file:


druid.extensions.coordinates=["io.druid.extensions:druid-examples","io.druid.extensions:druid-kafka-eight","io.druid.extensions:druid-s3-extensions"]

druid.extensions.coordinates=[\"io.druid.extensions:<dbextension>"]

The second one overwrites the first, preventing the S3 extension from being loaded, so Druid defaults to local deep storage.
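As a minimal sketch of why the second line wins (assuming the file is loaded through standard java.util.Properties semantics, which are last-writer-wins for duplicate keys; the coordinate values below are simplified placeholders):

```java
import java.io.StringReader;
import java.util.Properties;

// Demonstrates java.util.Properties duplicate-key behavior:
// only the last value assigned to a key is kept.
public class DuplicateKeyDemo {
    static String lastValue(String conf, String key) {
        try {
            Properties props = new Properties();
            props.load(new StringReader(conf));
            return props.getProperty(key);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Two assignments to the same key, as in the file above
        String conf =
            "druid.extensions.coordinates=[\"s3-extensions\"]\n" +
            "druid.extensions.coordinates=[\"metadata-storage\"]\n";
        System.out.println(lastValue(conf, "druid.extensions.coordinates"));
        // prints ["metadata-storage"] -- the first list is silently discarded
    }
}
```

The fix is to merge both extension lists into a single druid.extensions.coordinates line.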

Geoff Berger

Dec 3, 2015, 1:45:33 PM
to Druid User
Thanks, David. I noticed that afterwards and reran everything (including importing data to a new Kafka topic), but I'm still seeing the same thing.

I'll keep playing around, but any suggestions/help is welcome.

David Lim

Dec 3, 2015, 2:42:47 PM
to Druid User
Okay great. You should see a log message like this on startup: INFO [main] io.druid.storage.s3.S3DataSegmentPusher - Configured S3 as deep storage

You should not be seeing this: INFO [main] io.druid.segment.loading.LocalDataSegmentPusher - Configured local filesystem as deep storage

Geoff Berger

Dec 3, 2015, 6:16:03 PM
to Druid User
So after playing around with it for a bit, I was able to sort of get it working. To answer your question, I am now seeing io.druid.storage.s3.S3DataSegmentPusher. However, I am trying to import data via the Kafka firehose, sending around 600,000 events to Kafka. What's odd is I'm not seeing any activity from the Realtime node, but I noticed the Coordinator node logs outputting:

2015-12-03T23:06:01,143 WARN [Coordinator-Exec--0] io.druid.server.coordinator.rules.LoadRule - Not enough [_default_tier] servers or node capacity to assign segment[messages_2015-12-03T22:05:00.000Z_2015-12-03T22:10:00.000Z_2015-12-03T22:05:00.000Z]! Expected Replicants[2]
2015-12-03T23:06:01,144 WARN [Coordinator-Exec--0] io.druid.server.coordinator.rules.LoadRule - Not enough [_default_tier] servers or node capacity to assign segment[messages_2015-12-03T22:00:00.000Z_2015-12-03T22:05:00.000Z_2015-12-03T22:00:00.000Z]! Expected Replicants[2]

Something tells me the issues I'm seeing have to do with the format of the data being ingested from Kafka as JSON. On a side note, should there be any issues with nested JSON objects or an array within the JSON object? i.e.:

{
  "info": {"name": "meh", "moreInfo": ["one", "two"]},
  "someList": ["foo", "bar", "baz"]
}

David Lim

Dec 3, 2015, 11:04:54 PM
to Druid User
Awesome! Glad you got the S3 deep storage working.

Regarding the coordinator logs: by default, Druid is configured to load 2 replicas of each segment, placed on different historical nodes for high availability. You're likely seeing that message because you only have one historical node running. It's not a big deal, but if you want the warning to go away, you can modify the default rule set using the coordinator console.
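For reference, a single-replica rule would look roughly like this (a sketch of the loadForever rule shape, which may vary slightly between Druid versions):

```json
{
  "type": "loadForever",
  "tieredReplicants": { "_default_tier": 1 }
}
```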

I think your hunch is right and it is indeed a data format issue. Druid supports multi-value dimensions, but it does not support nested dimensions; these should be flattened before being read by Druid. In other words, from your example:

"someList": ["foo", "bar", "baz"] => multi-value dimension which is valid

"info": {"name": "meh", "moreInfo": ["one", "two"]} => nested dimension, should be flattened to something like:

{
  "info_name": "meh",
  "info_moreInfo": ["one", "two"]
}

When your realtime node starts accepting data, you'll see a log message that says something like "Announcing segment for [interval]". If you're not seeing this, Druid is probably throwing away your data because of its format, or because the message timestamps fall outside the acceptable time range (window period).