druid.coordinator.merge.on


Dave S

Jan 26, 2016, 3:21:42 AM
to Druid User
Hi,

I need more details on the coordinator config druid.coordinator.merge.on.

It is briefly described at http://druid.io/docs/latest/configuration/coordinator.html, but the description is not very helpful.

I turned on this flag and restarted the coordinator and historical nodes, but I am unable to tell whether it is working or not.

Do I need to submit tasks to kick off segment merging?

-Dave

Bingkun Guo

Jan 26, 2016, 11:16:50 AM
to Druid User
Hi Dave,

When you enable druid.coordinator.merge.on, the Coordinator will automatically send an Append Task to the Overlord
if it detects small segments that could be merged. You can tune mergeBytesLimit and mergeSegmentsLimit to
adjust how it identifies small segments. To verify it works, you could first generate a few small segments, then wait and see
if they get merged together after druid.coordinator.period.indexingPeriod.
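
For illustration, the relevant settings look roughly like this (a sketch only; the values are examples, not recommendations, and defaults may differ by version). In the coordinator runtime.properties:

    druid.coordinator.merge.on=true
    druid.coordinator.period.indexingPeriod=PT1800S

The merge limits live in the coordinator dynamic config (POSTed to /druid/coordinator/v1/config), e.g.:

    {
      "mergeBytesLimit": 524288000,
      "mergeSegmentsLimit": 100
    }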

Let me know if you have further questions.

Dave S

Feb 11, 2016, 1:16:10 AM
to Druid User
Thanks Bingkun.

I have an update on this thread. We enabled the flag and restarted the services, leaving the tuning parameters at their defaults, but no luck. We do not see any append tasks in the overlord console, nor any merged segments, even after druid.coordinator.period.indexingPeriod has passed.

We generate 1-minute segments and would like them to be merged into bigger segments (say, 1-hour segments) after a particular time interval.

How can we achieve this?

Thanks,
Dave

Dave S

Feb 11, 2016, 1:24:02 AM
to Druid User
BTW, we are using a linear shard spec for the segments. Please advise.

Fangjin Yang

Feb 11, 2016, 1:06:43 PM
to Druid User
Hi Dave, just out of curiosity, why use 1-minute segments?

Dave S

Feb 11, 2016, 10:26:02 PM
to Druid User
Hi Fangjin,

We are using Druid as a real-time time series DB, so why wouldn't we use 1-minute granularity?

Fangjin

Feb 12, 2016, 12:23:25 AM
to Druid User
Hi Dave, segment granularity controls partitioning and determines how data is sharded. Query granularity determines how data is rolled up. If you set query granularity to minute, your results will be explorable down to minute granularity at the finest. Data is queryable as soon as it enters Druid; that has nothing to do with granularity. If you set segmentGranularity to hour, segments are handed off to historicals after an hour.
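
For example (a sketch of an ingestion spec fragment, not your actual config), hour-sized segments with minute-level rollup would look like:

    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "hour",
      "queryGranularity": "minute"
    }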


Dave S

Feb 15, 2016, 12:42:48 AM
to Druid User
Thanks Fangjin.

Two comments:
1. We assumed that 1-minute segment and query granularity would give us better query performance than 1-hour segments with 1-minute query granularity. Is that true? We would like to know more about the query performance difference between those two options (1-minute segments vs. 1-hour segments, both with 1-minute query granularity).

2. It creates confusion that, programmatically, segment granularity can be equal to query granularity; it looks like an obvious choice for use cases like ours. I am wondering why Tranquility allows creating 1-minute segments when data can still be rolled up at the 1-minute level.
I think query granularity should always be less than segment granularity. In our experience, 1-minute segments create more operational problems for end users.
For example:
b. Due to issue #1, we set rules on datasources to purge segments after x days. We keep reducing x when a new datasource is added.
c. For each task, one folder is created. Since we use ext3, we hit the OS limit on the maximum number of folders that can be created under a particular folder. This applies to many configurable directories, like task.baseTaskDir, logs.directory, java.io.tmpdir and many more.
d. We have serious doubts about scaling the system, since we wish to add more datasources.
e. Too many workers are needed, and you need more middle managers due to lower single-node capacity.
    Here is the formula we often use to scale the system:
    NUM_WORKERS = num_minutes_for_one_task * num_partitions * num_replicas_per_datasource * num_datasources
    **num_minutes_for_one_task = 7 to 8 minutes (no matter what the segment or query granularity, memory capacity, or CPU core count is); we use 1 replica per datasource
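
    For example, with made-up numbers just to illustrate the formula: 8 minutes per task * 2 partitions * 1 replica * 10 datasources = 160 worker slots needed.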

What do you think about suggestion #2, and how can I submit it for consideration? Let me know if you have any questions.

Thanks,
Dave

Fangjin Yang

Feb 17, 2016, 4:09:37 PM
to Druid User
Inline.


On Sunday, February 14, 2016 at 9:42:48 PM UTC-8, Dave S wrote:
Thanks Fangjin.

Two comments:
1. We assumed that 1-minute segment and query granularity would give us better query performance than 1-hour segments with 1-minute query granularity. Is that true? We would like to know more about the query performance difference between those two options (1-minute segments vs. 1-hour segments, both with 1-minute query granularity).

More segments require more cores to process data in parallel.
There is overhead to scanning every segment.
Try to keep segment sizes in the 300-700 MB range.
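
To put rough numbers on it: with 1-minute segments, a single partition of a single datasource produces 60 * 24 = 1,440 segments per day, versus 24 per day with 1-hour segments, so the per-segment scan overhead adds up quickly.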
 
2. It creates confusion that, programmatically, segment granularity can be equal to query granularity; it looks like an obvious choice for use cases like ours. I am wondering why Tranquility allows creating 1-minute segments when data can still be rolled up at the 1-minute level.

 
I think query granularity should always be less than segment granularity. In our experience, 1-minute segments create more operational problems for end users.
For example:

The advice in the thread seems sound. Did you turn on segment merging?
 
b. Due to issue #1, we set rules on datasources to purge segments after x days. We keep reducing x when a new datasource is added.
c. For each task, one folder is created. Since we use ext3, we hit the OS limit on the maximum number of folders that can be created under a particular folder. This applies to many configurable directories, like task.baseTaskDir, logs.directory, java.io.tmpdir and many more.
d. We have serious doubts about scaling the system, since we wish to add more datasources.

# of datasources doesn't really matter as much as # of segments. I think there's some wonkiness with your setup.
 
e. Too many workers are needed, and you need more middle managers due to lower single-node capacity.
    Here is the formula we often use to scale the system:
    NUM_WORKERS = num_minutes_for_one_task * num_partitions * num_replicas_per_datasource * num_datasources
    **num_minutes_for_one_task = 7 to 8 minutes (no matter what the segment or query granularity, memory capacity, or CPU core count is); we use 1 replica per datasource

I'm not entirely sure what you mean here.