Druid 25.0.0 Compaction Task Issue

Diganta Mukherjee

Aug 28, 2023, 1:55:57 AM
to Druid User
Hi Team,
I recently upgraded to Druid 25.0.0, and since then compaction jobs (both manual and auto-compaction) do not start, even though the Overlord tries to assign them to a MiddleManager. The task never starts, and after the configured timeout it simply fails. In the Overlord logs I can see it reported that the task failed to start:


org.apache.druid.indexing.overlord.RemoteTaskRunner - Task assignment timed out on worker [worker-endpoint], never ran task [task-id]! Timeout: (300000 >= PT5M)!: {class=org.apache.druid.indexing.overlord.RemoteTaskRunner}


Digging further into the MiddleManager logs, I found the following exception while parsing the compaction config:


 2023-08-23T09:35:57,761 ERROR [TaskMonitorCache-0] org.apache.curator.framework.recipes.cache.PathChildrenCache -
com.fasterxml.jackson.databind.exc.InvalidTypeIdException: Could not resolve type id 'compaction' as a subtype of `org.apache.druid.segment.indexing.TuningConfig`: known type ids = [KafkaTuningConfig, index, index_parallel, kafka, realtime] (for POJO property 'tuningConfig')
 at [Source: (byte[])"{"type":"compact","id":"myid","resource":{"availabilityGroup":"ag","requiredCapacity":1},"dataSource":"ds","ioConfig":{"type":"compact","inputSpec":{"type":"interval","interval":"2023-08-21T00:00:00.000Z/2023-08-22T00:00:00.000Z","sha256OfSortedSegmentIds":null},"dropExisting":false},"dimensionsSpec":null,"transform"[truncated 1658 bytes]; line: 1, column: 666] (through reference chain: org.apache.druid.indexing.common.task.CompactionTask["tuningConfig"])
        at com.fasterxml.jackson.databind.exc.InvalidTypeIdException.from(InvalidTypeIdException.java:43) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.DeserializationContext.invalidTypeIdException(DeserializationContext.java:1758) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.DeserializationContext.handleUnknownTypeId(DeserializationContext.java:1265) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.jsontype.impl.TypeDeserializerBase._handleUnknownTypeId(TypeDeserializerBase.java:290) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.jsontype.impl.TypeDeserializerBase._findDeserializer(TypeDeserializerBase.java:162) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:113) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:97) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:254) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:527) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeWithErrorWrapping(BeanDeserializer.java:528) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:417) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1287) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:326) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:194) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:161) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:130) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:97) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:254) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:68) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4202) ~[jackson-databind-2.10.2.jar:2.10.2]
        at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3266) ~[jackson-databind-2.10.2.jar:2.10.2]
        at org.apache.druid.indexing.worker.WorkerTaskMonitor$1.childEvent(WorkerTaskMonitor.java:165) ~[druid-indexing-service-0.20.0.jar:0.20.0]
        at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:538) [curator-recipes-4.3.0.jar:4.3.0]
        at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:532) [curator-recipes-4.3.0.jar:4.3.0]
        at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:100) [curator-framework-4.3.0.jar:4.3.0]
        at org.apache.curator.shaded.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) [curator-client-4.3.0.jar:?]
        at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:92) [curator-framework-4.3.0.jar:4.3.0]
        at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:530) [curator-recipes-4.3.0.jar:4.3.0]
        at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35) [curator-recipes-4.3.0.jar:4.3.0]
        at org.apache.curator.framework.recipes.cache.PathChildrenCache$9.run(PathChildrenCache.java:808) [curator-recipes-4.3.0.jar:4.3.0]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_262]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_262]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_262]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_262]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_262]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_262]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_262]



The compaction payload on the Druid UI console also shows the type reported in the exception:

"tuningConfig": {
    "type": "compaction",

Even when I submit a manual compaction job with the tuning config below, the actual submitted job shown on the console somehow has the type "compaction":

 "tuningConfig": {
        "type": "index_parallel",
        "maxRowsPerSegment": 5000000,
        "maxRowsInMemory": 5000000,
        "maxNumSegmentsToMerge": 100
    }

Below is my compaction spec:

{
    "dataSource": "my-datasource",
    "taskPriority": 25,
    "inputSegmentSizeBytes": 100000000000000,
    "maxRowsPerSegment": null,
    "skipOffsetFromLatest": "PT31H",
    "tuningConfig":
    {
        "maxRowsInMemory": null,
        "appendableIndexSpec": null,
        "maxBytesInMemory": null,
        "maxTotalRows": null,
        "splitHintSpec": null,
        "partitionsSpec":
        {
            "type": "dynamic",
            "maxRowsPerSegment": 5000000,
            "maxTotalRows": null
        },
        "indexSpec": null,
        "indexSpecForIntermediatePersists": null,
        "maxPendingPersists": null,
        "pushTimeout": null,
        "segmentWriteOutMediumFactory": null,
        "maxNumConcurrentSubTasks": 5,
        "maxRetry": null,
        "taskStatusCheckPeriodMs": null,
        "chatHandlerTimeout": null,
        "chatHandlerNumRetries": null,
        "maxNumSegmentsToMerge": null,
        "totalNumMergeTasks": null,
        "maxColumnsToMerge": 10,
        "forceGuaranteedRollup": false,
        "type": "index_parallel"
    },
    "granularitySpec": null,
    "dimensionsSpec": null,
    "metricsSpec": null,
    "transformSpec": null,
    "ioConfig": null,
    "taskContext": null
}
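For context, I submit this spec to the Coordinator's auto-compaction config API. A sketch of the call, assuming the spec above is saved as compaction-spec.json and the Coordinator runs on the default port (host, port, and file name here are placeholders for my environment):

```shell
# Submit the per-datasource auto-compaction spec to the Coordinator.
# coordinator-host:8081 and compaction-spec.json are placeholders.
curl -X POST "http://coordinator-host:8081/druid/coordinator/v1/config/compaction" \
  -H "Content-Type: application/json" \
  -d @compaction-spec.json
```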

Any pointers on what could be happening here? Is there a workaround for this, or a config I need to set correctly?

John Kowtko

Sep 22, 2023, 3:11:27 PM
to Druid User
When I run a manual compaction and pull the spec from an existing compaction job, the tuningConfig type is "compaction", not "index_parallel" ... so maybe try that in the spec you submit manually.

If auto-compaction is not working, I suggest deleting the auto-compaction spec and recreating it ... possibly the upgrade changed something in the spec format that it is tripping over ...
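For reference, the tuningConfig I see when pulling the spec from an existing compaction job looks roughly like this (a sketch; the field values are illustrative, the relevant part is the type id):

```json
"tuningConfig": {
    "type": "compaction",
    "partitionsSpec": {
        "type": "dynamic",
        "maxRowsPerSegment": 5000000
    }
}
```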
