Scheduled Compaction is not triggering any Tasks


Alon Shoshani

unread,
Jul 29, 2019, 4:00:45 AM7/29/19
to Druid User
Hi,
I'm running Druid 0.14 with the Kafka indexing service.
I'm trying to run scheduled compaction, and I would like the segments to be around 500 MB, as recommended in the documentation.
My MiddleManager has 2 workers and the ratio is 0.5, meaning 1 compaction job at a time.
This is my configuration, and I have a time interval whose segments have the sizes shown below.
I expect the first segment to be merged with the second one,
but the tasks just don't start...
Any ideas?



{
    "compactionConfigs": [
        {
            "dataSource": "events",
            "keepSegmentGranularity": true,
            "taskPriority": 25,
            "inputSegmentSizeBytes": 536870912,
            "targetCompactionSizeBytes": 419430400,
            "maxRowsPerSegment": null,
            "maxNumSegmentsToCompact": 150,
            "skipOffsetFromLatest": "P3D",
            "tuningConfig": null,
            "taskContext": null
        }
    ],
    "compactionTaskSlotRatio": 0.5,
    "maxCompactionTaskSlots": 1
}

 

[Attachment: Screen Shot 2019-07-29 at 10.51.25 AM.png]

Gaurav Bhatnagar

unread,
Jul 29, 2019, 3:21:59 PM7/29/19
to Druid User
The default config allows 10% of task slots to be used by compaction jobs. In your case, try submitting a compaction task manually and monitor the log. Hope this helps.
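A manual compaction task can be submitted to the Overlord's task endpoint. A minimal sketch, assuming an Overlord at localhost:8090 (a placeholder) and the "events" datasource from the thread; the interval is also a placeholder for the time chunk you want to compact:

```python
import json
from urllib import request

# Sketch of a manual compaction task spec. The interval below is a
# placeholder; set it to the time chunk you want compacted.
task = {
    "type": "compact",
    "dataSource": "events",
    "interval": "2019-07-01T00:00:00.000Z/2019-07-02T00:00:00.000Z",
    "targetCompactionSizeBytes": 419430400,  # ~400 MB, as in the auto-compaction config
}

def submit_task(overlord="http://localhost:8090"):
    """POST the task to the Overlord's task endpoint and return its response."""
    req = request.Request(
        overlord + "/druid/indexer/v1/task",
        data=json.dumps(task).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)  # the Overlord replies with the task id on success
```

You can then watch the task's log in the Overlord console to see why (or whether) it runs.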

Jihoon Son

unread,
Jul 29, 2019, 4:16:48 PM7/29/19
to Druid User
Hi Alon,

Currently in auto compaction, the compaction happens atomically per time chunk, which means all segments in the same time chunk are compacted together or not at all.
From the screenshot you shared, this time chunk looks like it has 5 segments, and their total size is greater than the configured "inputSegmentSizeBytes", which is 512 MB.
You need to raise this above the total size of the segments in each time chunk (probably 1.5 - 2 GB?).

Jihoon
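The per-time-chunk rule described above can be sketched as follows (a simplification, not Druid's actual code; the segment sizes are illustrative numbers adding up to roughly the 2.5 GB mentioned later in the thread):

```python
# Sketch of the per-time-chunk rule: auto compaction considers ALL segments
# in a time chunk together, and skips the chunk when their total size
# exceeds inputSegmentSizeBytes.
def chunk_eligible(segment_sizes_bytes, input_segment_size_bytes):
    """The whole chunk is eligible only if all its segments fit in the budget."""
    return sum(segment_sizes_bytes) <= input_segment_size_bytes

MB = 1024 * 1024
# Illustrative sizes: ~2.5 GB total in one time chunk.
sizes = [54 * MB, 264 * MB, 700 * MB, 800 * MB, 742 * MB]

chunk_eligible(sizes, 536870912)   # False: the 2.5 GB total exceeds the 512 MB limit
chunk_eligible(sizes, 2684354560)  # True once the limit is raised to 2.5 GiB
```

This is why no task starts with the 512 MB setting even though individual segments are small: the check applies to the chunk's total, not to pairs of segments.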

--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/8bb64b43-b5fa-4b6b-be84-833310d17c0a%40googlegroups.com.

Alon Shoshani

unread,
Jul 30, 2019, 2:44:27 AM7/30/19
to druid...@googlegroups.com
Hi,
According to your claim, the coordinator will compact the same interval over and over (I already ran into this issue),
because the sum of the segments will always be lower than or equal to 2.5 GB.
According to the documentation and the example here:
look at the "bar" datasource: it has 3 segments, and the sum of only two of them is 20 MB, so it will merge those two and not the third one.
https://druid.apache.org/docs/0.15.0-incubating/design/coordinator.html#segment-search-policy

[Attachment: image.png]
What do you say? Any ideas?


 

Alon Shoshani, R&D






Alon Shoshani

unread,
Jul 30, 2019, 2:45:36 AM7/30/19
to Druid User
When I compact manually it works, but it's for a specific time frame, and I want an automatic process...


Venkat Poornalingam

unread,
Jul 30, 2019, 7:48:30 AM7/30/19
to druid...@googlegroups.com

I think it's a documentation error. It should have been `targetCompactionSizeBytes` rather than `inputSegmentSizeBytes` in the documentation link you referred to - https://druid.apache.org/docs/0.15.0-incubating/design/coordinator.html#segment-search-policy. I would await Jihoon's reply, though.

 

The same interval would be compacted again and again until the optimal size is reached (`targetCompactionSizeBytes`). If this is not the case, then maybe it's an issue.

 


Alon Shoshani

unread,
Jul 30, 2019, 8:11:15 AM7/30/19
to Druid User
As I understand it,
the segment search algorithm looks at one interval and its segments. If it finds two segments in the same interval whose total size is less than inputSegmentSizeBytes,
it will merge them.
In the next iteration, if the same interval still has candidates for merging, it merges them;
if not, it looks for the next interval that has segments it can merge (total lower than inputSegmentSizeBytes).

Does that make sense?


Venkat Poornalingam

unread,
Jul 30, 2019, 8:51:26 AM7/30/19
to druid...@googlegroups.com

Yes. It makes sense to me.

 

Note: I re-read the documentation and it is correct.



Alon Shoshani

unread,
Jul 30, 2019, 12:00:41 PM7/30/19
to druid...@googlegroups.com
So is it a bug?
Who can handle it?


Venkat Poornalingam

unread,
Jul 31, 2019, 1:52:48 AM7/31/19
to druid...@googlegroups.com

Hi Alon

 

I meant there is no bug in the documentation.

But yes, Druid keeps checking for compaction possibilities when the inputSegmentSizeBytes condition is met, and sometimes Druid might end up doing the compaction again.

This is expected to be fixed in the next iteration of auto compaction.

 

Thanks & Rgds

Venkat


Alon Shoshani

unread,
Jul 31, 2019, 5:01:30 AM7/31/19
to druid...@googlegroups.com
But in my case the conditions are met and the tasks are not executed.
You can take a look at my first post and my scenario; I expect Druid to compact my first two segments.

What do you think?

Venkat Poornalingam

unread,
Jul 31, 2019, 10:53:15 AM7/31/19
to druid...@googlegroups.com

Alon

 

Could you please let me know the number of segments for the given interval and the total size of the segments?

 

Is the total size of all the segments in the interval < inputSegmentSizeBytes? That is the expected condition.

Alon Shoshani

unread,
Jul 31, 2019, 3:39:39 PM7/31/19
to druid...@googlegroups.com

Hi Venkat,
The inputSegmentSizeBytes configured in my config is 536870912 bytes (~512 MB).
I have 7 segments in this interval, and their total size is 2.5 GB.
My first segment is 54 MB and my second is 264 MB, so I expect compaction to find these two in the same interval and merge them.

If the total size of the segments has to be less than inputSegmentSizeBytes, Druid keeps executing the same
task on the same interval over and over, which causes slowness in real-time queries, so it makes no
sense to require the total size to be less than inputSegmentSizeBytes.

According to the documentation, if you read the "bar" datasource example, it merges the first two segments without the third one
(each segment is 10 MB and the inputSegmentSizeBytes is 20 MB).


Venkat Poornalingam

unread,
Jul 31, 2019, 10:52:55 PM7/31/19
to druid...@googlegroups.com

Can you please set it to 2.5 GB and check whether your segments are being auto compacted?
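A sketch of that change, assuming a Coordinator at localhost:8081 (a placeholder) and the `/druid/coordinator/v1/config/compaction` endpoint for per-datasource compaction configs; check the API docs for your Druid version. The other fields mirror the config from the first post:

```python
import json
from urllib import request

# Per-datasource compaction config with inputSegmentSizeBytes raised to
# 2.5 GiB, so the whole time chunk fits under the limit.
compaction_config = {
    "dataSource": "events",
    "inputSegmentSizeBytes": 2684354560,   # 2.5 GiB
    "targetCompactionSizeBytes": 419430400,
    "skipOffsetFromLatest": "P3D",
}

def update_config(coordinator="http://localhost:8081"):
    """POST the per-datasource compaction config to the Coordinator."""
    req = request.Request(
        coordinator + "/druid/coordinator/v1/config/compaction",
        data=json.dumps(compaction_config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    request.urlopen(req)
```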

Alon Shoshani

unread,
Aug 1, 2019, 5:01:05 AM8/1/19
to druid...@googlegroups.com
Yes, when I set it to 2.5 GB, the segments are compacted.
But Druid keeps executing the same task on the same interval again and again. Why?

 
Venkat Poornalingam

unread,
Aug 1, 2019, 7:11:24 AM8/1/19
to druid...@googlegroups.com

That’s a bug expected to be fixed in an upcoming version.

Alon Shoshani

unread,
Aug 1, 2019, 7:40:26 AM8/1/19
to druid...@googlegroups.com
Ok,
maybe you can help me with another issue in Druid 0.14.
I'm not able to send Graphite metrics using the graphite emitter; it just ignores the host I provide in the configuration,
while when the emitter is set to logging, the metrics are logged to a file...

I would appreciate any help.
I also opened a bug on GitHub, but got no response...
https://groups.google.com/forum/#!searchin/druid-user/graphite$20emitter%7Csort:date/druid-user/ikI6aKkjmc0/ealRd390AgAJ

 
 

Venkat Poornalingam

unread,
Aug 1, 2019, 7:42:14 AM8/1/19
to druid...@googlegroups.com

Alon,

 

Please post a new thread for this, so that someone who is working on the graphite emitter can also reply.

Alon Shoshani

unread,
Aug 1, 2019, 7:48:02 AM8/1/19
to druid...@googlegroups.com
I already posted twice... no comments...

 

Venkat Poornalingam

unread,
Aug 1, 2019, 8:20:49 AM8/1/19
to druid...@googlegroups.com

Can you please try with Druid 0.15? I hope you have the right `druid.extensions.loadList` entries in common.runtime.properties.
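For what it's worth, the graphite emitter wiring in common.runtime.properties typically looks like the sketch below (property names as in the graphite-emitter extension docs; the hostname and the load list entries are placeholders for your setup):

```properties
druid.extensions.loadList=["druid-kafka-indexing-service", "graphite-emitter"]
druid.emitter=graphite
druid.emitter.graphite.hostname=your.graphite.host
druid.emitter.graphite.port=2003
druid.emitter.graphite.eventConverter={"type":"all", "namespacePrefix":"druid"}
```

If the extension is not in the load list, the emitter config is silently ignored, which would match the symptom you describe.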

Alon Shoshani

unread,
Aug 1, 2019, 8:28:53 AM8/1/19
to druid...@googlegroups.com
For Druid 0.15 we would need to upgrade Kafka... Is this a known bug?

 

Venkat Poornalingam

unread,
Aug 1, 2019, 9:12:46 AM8/1/19
to druid...@googlegroups.com

Alon, I'm not sure about it. I haven't tried the graphite emitter.
