ERROR [task-runner-0-priority-0] io.druid.indexing.common.task.MergeTaskBase

naveen kavuri

Nov 15, 2018, 11:28:46 AM
to Druid User
Hi,

We have been experiencing this issue but can't seem to find the root cause. All of the indexing tasks seem to be failing.

Logs:
2018-11-09T15:45:10,861 ERROR [task-runner-0-priority-0] io.druid.indexing.common.task.MergeTaskBase - Exception merging[test-requests]: {class=io.druid.indexing.common.task.MergeTaskBase, exceptionType=class java.lang.IllegalArgumentException, exceptionMessage=Instantiation of [simple type, class io.druid.segment.loading.LocalLoadSpec] value failed: [/data/druid/deep/test-requests/2018-09-21T11:00:00.000Z_2018-10-31T14:00:00.000Z/2018-10-31T14:21:43.268Z/0/index.zip] does not exist, interval=2018-09-21T11:00:00.000Z/2018-11-09T15:00:00.000Z}
java.lang.IllegalArgumentException: Instantiation of [simple type, class io.druid.segment.loading.LocalLoadSpec] value failed: [/data/druid/deep/test-requests/2018-09-21T11:00:00.000Z_2018-10-31T14:00:00.000Z/2018-10-31T14:21:43.268Z/0/index.zip] does not exist
at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:2774) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.ObjectMapper.convertValue(ObjectMapper.java:2700) ~[jackson-databind-2.4.6.jar:2.4.6]
at io.druid.segment.loading.SegmentLoaderLocalCacheManager.loadInLocation(SegmentLoaderLocalCacheManager.java:205) ~[druid-server-0.10.1.jar:0.10.1]
at io.druid.segment.loading.SegmentLoaderLocalCacheManager.loadInLocationWithStartMarker(SegmentLoaderLocalCacheManager.java:195) ~[druid-server-0.10.1.jar:0.10.1]
at io.druid.segment.loading.SegmentLoaderLocalCacheManager.loadSegmentWithRetry(SegmentLoaderLocalCacheManager.java:154) ~[druid-server-0.10.1.jar:0.10.1]
at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegmentFiles(SegmentLoaderLocalCacheManager.java:130) ~[druid-server-0.10.1.jar:0.10.1]
at io.druid.indexing.common.TaskToolbox.fetchSegments(TaskToolbox.java:214) ~[druid-indexing-service-0.10.1.jar:0.10.1]
at io.druid.indexing.common.task.MergeTaskBase.run(MergeTaskBase.java:155) [druid-indexing-service-0.10.1.jar:0.10.1]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.10.1.jar:0.10.1]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.10.1.jar:0.10.1]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_144]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_144]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_144]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Instantiation of [simple type, class io.druid.segment.loading.LocalLoadSpec] value failed: [/data/druid/deep/test-requests/2018-09-21T11:00:00.000Z_2018-10-31T14:00:00.000Z/2018-10-31T14:21:43.268Z/0/index.zip] does not exist
at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.wrapException(StdValueInstantiator.java:405) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:234) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:167) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:398) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1064) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:264) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:156) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:126) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:113) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:84) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:132) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:41) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:2769) ~[jackson-databind-2.4.6.jar:2.4.6]
... 13 more
Caused by: java.lang.IllegalArgumentException: [/data/druid/deep/test-requests/2018-09-21T11:00:00.000Z_2018-10-31T14:00:00.000Z/2018-10-31T14:21:43.268Z/0/index.zip] does not exist
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:148) ~[guava-16.0.1.jar:?]
at io.druid.segment.loading.LocalLoadSpec.<init>(LocalLoadSpec.java:51) ~[druid-server-0.10.1.jar:0.10.1]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_144]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_144]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_144]
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_144]
at com.fasterxml.jackson.databind.introspect.AnnotatedConstructor.call(AnnotatedConstructor.java:125) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:230) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:167) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:398) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1064) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:264) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:156) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:126) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:113) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:84) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:132) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:41) ~[jackson-databind-2.4.6.jar:2.4.6]
at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:2769) ~[jackson-databind-2.4.6.jar:2.4.6]
... 13 more
2018-11-09T15:45:10,866 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [merge_test-requests_bdd572cb7e9d4d83752bb158970d34710efe6685_2018-11-09T15:44:59.482Z] status changed to [FAILED].
2018-11-09T15:45:10,868 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "merge_test-requests_bdd572cb7e9d4d83752bb158970d34710efe6685_2018-11-09T15:44:59.482Z",
  "status" : "FAILED",
  "duration" : 33
}
************************************************************************************************************************

I can see that the segment was created in a different path, and deep storage doesn't have that segment.

2018-11-12T14:00:55,147 INFO [main] io.druid.indexing.worker.executor.ExecutorLifecycle - Running with task: {
  "type" : "append",
  "id" : "merge_test-requests_75a016d378f18e97af34fc7f22f2894d3d1142e5_2018-11-12T14:00:44.001Z",
  "dataSource" : "test-requests",
  "segments" : [ {
    "dataSource" : "test-requests",
    "interval" : "2018-09-21T11:00:00.000Z/2018-10-31T14:00:00.000Z",
    "version" : "2018-10-31T14:21:43.268Z",
    "loadSpec" : {
      "type" : "local",
      "path" : "/data/druid/deep/test-requests/2018-09-21T11:00:00.000Z_2018-10-31T14:00:00.000Z/2018-10-31T14:21:43.268Z/0/index.zip"
    },
    "dimensions" : "env,host,site",
    "metrics" : "bytesSum,count,durationSum,unique_ips,unique_users",
    "shardSpec" : {
      "type" : "none"
    },
    "binaryVersion" : 9,
    "size" : 1525630,
    "identifier" : "test-requests_2018-09-21T11:00:00.000Z_2018-10-31T14:00:00.000Z_2018-10-31T14:21:43.268Z"
  }, {
    "dataSource" : "test-requests",
    "interval" : "2018-10-31T15:00:00.000Z/2018-10-31T16:00:00.000Z",
    "version" : "2018-10-31T15:00:00.000Z",
    "loadSpec" : {
      "type" : "local",
      "path" : "/auto/stage-data/druid/data/test-requests/2018-10-31T15:00:00.000Z_2018-10-31T16:00:00.000Z/2018-10-31T15:00:00.000Z/0/index.zip"

The deep storage configured in our properties file is deep_storage_path => '/auto/stage-data/druid/data'.

Jihoon Son

Nov 15, 2018, 2:49:01 PM
to druid...@googlegroups.com
Hi,

This error can happen if you changed the deep storage type after a segment was created. What deep storage are you using?

Jihoon


naveen kavuri

Nov 15, 2018, 5:48:35 PM
to Druid User
Hi Jihoon,

We are using a MySQL DB for deep storage, and we have not changed it recently. We also observe that the path used for loading segments, "/data/druid/deep/test-requests/", is actually the fail-over or default directory we created, with "/auto/stage-data/druid/data" being the real deep storage path. We have verified that the common properties file has the correct deep storage path.

Also, I see the pattern "merge_test-usage_62fb5b9984597a6edd061644d937a71517e45bhf" repeated across all of the failed tasks relating to one particular real-time node. Is it normal for all the ingestion tasks to share the same prefix? I checked other datasources and their prefixes vary slightly. I suspect it's a single ingestion task that keeps failing and being resubmitted. Is that possible?

Please let me know.

Thanks!

Jihoon Son

Nov 15, 2018, 5:59:36 PM
to druid...@googlegroups.com
Hi,

I guess you're using MySQL for the metadata store. Deep storage is the storage where segments are kept; Druid supports the local file system, HDFS, AWS S3, and so on by default (http://druid.io/docs/latest/dependencies/deep-storage.html).
I think the problem is not the path but the type. Would you please double-check the deep storage type?

Regarding the realtime tasks, the behavior differs depending on the realtime ingestion type. What method are you using? Is it Tranquility?

Jihoon

naveen kavuri

Nov 15, 2018, 6:04:47 PM
to Druid User

Oops, my bad. We are using local deep storage:

druid.storage.type=local
druid.storage.storageDirectory=/auto/stage-data/druid/data

We have these properties set in common.runtime.properties.
Thanks!

naveen kavuri

Nov 15, 2018, 6:09:30 PM
to Druid User
To add to the above: we are using the kafka-0.8 firehose for data ingestion.

Jihoon Son

Nov 15, 2018, 6:10:55 PM
to druid...@googlegroups.com
Thanks,

Then the problem is that '/data/druid/deep/test-requests/2018-09-21T11:00:00.000Z_2018-10-31T14:00:00.000Z/2018-10-31T14:21:43.268Z/0/index.zip' doesn't exist.
If you're running Druid on a multi-node cluster, you should use a file system that can be accessed by all middleManagers and historicals, such as NFS or HDFS.
If you're running on a single-node cluster, please check that the file exists and that the Druid process has the right permissions to read/write data at that path.
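As a quick sanity check on the node running the task, something like this can confirm both conditions (a minimal sketch; the `druid` service account name is an assumption, so substitute whatever user your middleManager/historical processes run as):

```
# Path taken from the error message above.
SEGMENT=/data/druid/deep/test-requests/2018-09-21T11:00:00.000Z_2018-10-31T14:00:00.000Z/2018-10-31T14:21:43.268Z/0/index.zip

# Does the file exist at all?
ls -l "$SEGMENT"

# Can the Druid service account actually read it?
sudo -u druid test -r "$SEGMENT" && echo "readable" || echo "missing or not readable"
```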

Jihoon


naveen kavuri

Nov 15, 2018, 6:25:50 PM
to Druid User
We are currently running a multi-node cluster where the middleManager is deployed on a couple of nodes and the historicals on different nodes, with the deep storage type set to local. If the segments are created with a local deep storage path, would the middleManager and historicals need to run on the same machine to get rid of the error?

Thanks

naveen kavuri

Nov 15, 2018, 6:37:23 PM
to Druid User
Can you please elaborate on what you mean by "then the problem is that '/data/druid/deep/test-requests/2018-09-21T11:00:00.000Z_2018-10-31T14:00:00.000Z/2018-10-31T14:21:43.268Z/0/index.zip' doesn't exist"? Do you mean that the middleManager cannot hand off segments? My question is: if segments are created under the "/auto/stage-data/" path, why is Druid looking at the "/data/druid/deep" path, which is a default directory? I am relatively new to Druid, so please bear with me.

Thanks! 

Jonathan Wei

Nov 16, 2018, 3:52:17 PM
to druid...@googlegroups.com
From this:

``` 
    "loadSpec" : {
      "type" : "local",
      "path" : "/data/druid/deep/test-requests/2018-09-21T11:00:00.000Z_2018-10-31T14:00:00.000Z/2018-10-31T14:21:43.268Z/0/index.zip"
    },

....

    "loadSpec" : {
      "type" : "local",
      "path" : "/auto/stage-data/druid/data/test-requests/2018-10-31T15:00:00.000Z_2018-10-31T16:00:00.000Z/2018-10-31T15:00:00.000Z/0/index.zip"
```

It looks like there was a change in the deep storage location at some point. If so, Druid still expects segments that were created before the location change to be at the old location (the "loadSpec" and other info is stored in the metadata DB, MySQL in your case).

If all of the segments have been copied/moved to the new "/auto" location, you can edit the segment entries in the metadata store (should be in the "druid_segments" table) to point to the new path in their loadSpec instead.
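A minimal sketch of that edit, assuming the segments really have all been moved to the new location, that the metadata database is named `druid` (an assumption; use your actual database name), and that the old prefix only ever appears inside loadSpec paths. Back up the table first:

```
# Back up the segments table before editing it (database name 'druid' assumed).
mysqldump druid druid_segments > druid_segments.backup.sql

# Find entries still pointing at the old location.
mysql druid -e "SELECT id FROM druid_segments WHERE payload LIKE '%/data/druid/deep%';"

# Rewrite the old path prefix to the new one inside each loadSpec.
mysql druid -e "UPDATE druid_segments
  SET payload = REPLACE(payload, '/data/druid/deep', '/auto/stage-data/druid/data')
  WHERE payload LIKE '%/data/druid/deep%';"
```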


Jonathan Wei

Nov 16, 2018, 3:53:44 PM
to druid...@googlegroups.com
Alternatively, it might be easier just to create a softlink at `/data/druid/deep` to `/auto/stage-data/druid/data`.
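For instance (run on every node that needs to read segments, and only if `/data/druid/deep` doesn't already exist as a real directory):

```
# Make the old deep storage location resolve to the new one.
ln -s /auto/stage-data/druid/data /data/druid/deep
```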


naveen kavuri

Nov 16, 2018, 5:22:52 PM
to Druid User
We did check both deep storage paths, and the segment "2018-09-21T11:00:00.000Z_2018-10-31T14:00:00.000Z" does not exist in either
"path" : "/data/druid/deep/test-requests/2018-09-21T11:00:00.000Z_2018-10-31T14:00:00.000Z/2018-10-31T14:21:43.268Z/0/index.zip"
or
"path" : "/auto/stage-data/druid/data/test-requests/2018-10-31T15:00:00.000Z_2018-10-31T16:00:00.000Z/2018-10-31T15:00:00.000Z/0/index.zip",
and we checked the metadata DB, which doesn't have an entry for that segment.

Anyway, if the segment is not available and Druid is still trying to load it, is there any way we can manually kill it and stop the process looking for it? We have persistent ingestion tasks failing and looking for the same segment over and over. Is there any workaround for this?

Jonathan Wei

Nov 16, 2018, 6:03:20 PM
to druid...@googlegroups.com
What does your task spec look like?

naveen kavuri

Nov 18, 2018, 11:17:25 PM
to Druid User
Hello,
Below is the spec we are using currently. Thanks!
[
  {
    "dataSchema" : {
      "dataSource" : "test-requests",
      "parser" : {
        "type" : "string",
        "parseSpec" : {
          "format" : "json",
          "timestampSpec" : {
            "column" : "@timestamp",
            "format" : "auto"
          },
          "dimensionsSpec" : {
            "dimensions": ["site","env","host","method","statuscode","bytes","duration","resource_type","repo","clientip","timestamp","username"],
            "dimensionExclusions" : [],
            "spatialDimensions" : []
          }
        }
      },
      "metricsSpec" : [
        {
          "type" : "count",
          "name" : "count"
        },
        {
          "type" : "longSum",
          "name" : "bytesSum",
          "fieldName" : "bytes"
        },
        {
          "type" : "longSum",
          "name" : "durationSum",
          "fieldName" : "duration"
        },
        {
          "type" : "hyperUnique",
          "name" : "unique_ips",
          "fieldName" : "clientip"
        },
        {
          "type" : "hyperUnique",
          "name" : "unique_users",
          "fieldName" : "username"
        }
      ],
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "HOUR",
        "queryGranularity" : "NONE"
      }
    },
    "ioConfig" : {
      "type" : "realtime",
      "firehose": {
        "type": "kafka-0.8",
        "consumerProps": {
          "zookeeper.connect": "localhost:8084",
          "zookeeper.connection.timeout.ms" : "15000",
          "zookeeper.session.timeout.ms" : "15000",
          "zookeeper.sync.time.ms" : "5000",
          "group.id": "druid-test-requests",
          "fetch.message.max.bytes" : "1048586",
          "auto.offset.reset": "largest",
          "auto.commit.enable": "false"
        },
        "feed": "log.test.json"
      },
      "plumber": {
        "type": "realtime"
      }
    },
    "tuningConfig": {
      "type" : "realtime",
      "maxRowsInMemory": 500000,
      "intermediatePersistPeriod": "PT15m",
      "windowPeriod": "PT10m",
      "basePersistDirectory": "/index/realtime",
      "rejectionPolicy": {
        "type": "serverTime"
      }
    }
  }
]

Jonathan Wei

Nov 19, 2018, 1:03:42 PM
to druid...@googlegroups.com
> 2018-11-09T15:45:10,861 ERROR [task-runner-0-priority-0] io.druid.indexing.common.task.MergeTaskBase - Exception merging[test-requests]: {class=io.druid.indexing.common.task.MergeTaskBase, exceptionType=class java.lang.IllegalArgumentException, exceptionMessage=Instantiation of [simple type, class io.druid.segment.loading.LocalLoadSpec] value failed: [/data/druid/deep/test-requests/2018-09-21T11:00:00.000Z_2018-10-31T14:00:00.000Z/2018-10-31T14:21:43.268Z/0/index.zip] does not exist, interval=2018-09-21T11:00:00.000Z/2018-11-09T15:00:00.000Z}

Hm, that spec is for an "index_realtime" task, but the failure in the original message is from a "merge" task. What's the spec for the merge task that's being submitted?

In general, I would recommend searching through your metadata DB for segments that still have "/data/druid" in their load spec; the APIs here (http://druid.io/docs/latest/operations/api-reference.html#coordinator) may also be useful.
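For example (a sketch; the coordinator host/port and the `druid` database name are assumptions for illustration):

```
# Find used segments whose loadSpec still references the old path.
mysql druid -e "SELECT id FROM druid_segments WHERE used = 1 AND payload LIKE '%/data/druid%';"

# Or ask the coordinator for full segment metadata for the datasource.
curl "http://coordinator-host:8081/druid/coordinator/v1/datasources/test-requests/segments?full"
```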

> Anyway, if the segment is not available and Druid is still trying to load it, is there any way we can manually kill it and stop the process looking for it?

---------

As a side note, I would recommend upgrading to a later version of Druid such as 0.12.3 if possible; I see that you're using some deprecated features right now:
- Kafka 0.8 consumer and realtime tasks -> superseded by the Kafka Indexing Service (http://druid.io/docs/latest/development/extensions-core/kafka-ingestion)
- Merge tasks -> compaction tasks (http://druid.io/docs/latest/ingestion/compaction.html); a sketch of submitting one follows below
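For what it's worth, on 0.12.x a compaction task can be submitted roughly like this (a sketch; the overlord host/port are assumptions, and the interval is just the one from this thread):

```
# Submit a compaction task to the overlord; it replaces the old merge task.
curl -X POST -H "Content-Type: application/json" \
  "http://overlord-host:8090/druid/indexer/v1/task" \
  -d '{
    "type": "compact",
    "dataSource": "test-requests",
    "interval": "2018-09-21T11:00:00.000Z/2018-11-09T15:00:00.000Z"
  }'
```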



naveen kavuri

Nov 21, 2018, 9:50:06 AM
to Druid User
Hey Jon, we actually found the segments in the metadata store (MySQL) and disabled them. I have a question though: for the "used" column, does changing it from 1 to 0 stop the segments from being loaded onto the historicals?

Jihoon Son

Dec 4, 2018, 9:50:03 PM
to Druid User
Yes, segments with used=0 won't be loaded on the historicals.
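For reference, the same thing can be done either directly in the metadata store or through the coordinator API, which marks the segment unused without deleting any data (a sketch; the host/port and `druid` database name are assumptions, and the segment id is the one from this thread):

```
# Disable a segment by flipping its 'used' flag in the metadata store.
mysql druid -e "UPDATE druid_segments SET used = 0
  WHERE id = 'test-requests_2018-09-21T11:00:00.000Z_2018-10-31T14:00:00.000Z_2018-10-31T14:21:43.268Z';"

# Equivalent via the coordinator (disables the segment, doesn't delete files).
curl -X DELETE "http://coordinator-host:8081/druid/coordinator/v1/datasources/test-requests/segments/test-requests_2018-09-21T11:00:00.000Z_2018-10-31T14:00:00.000Z_2018-10-31T14:21:43.268Z"
```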

Jihoon
