Configure Data retention by data age


Diganta Mukherjee

May 12, 2022, 4:54:30 AM
to Druid User
Hi,
I recently split my data node infrastructure into Hot and Cold tiers.
I want the segments of a particular datasource to be split between the Hot and Cold tiers based on data age. For example:
If T = CURRENT_TIMESTAMP
Segments from T-3M to T should be loaded into the Hot tier only (the last 3 months of data)
Segments from T-1Y to T-3M should be loaded into the Cold tier only (the last year of data, excluding the most recent 3 months)
The Period Load Rule documentation says: "The interval of a segment will be compared against the specified period."
So if I set up the rules below, will I get the intended split, or will the last 3 months of data also be loaded into the Cold tier, since those segments also match the P1Y period? Could you please provide a sample spec or some pointers for achieving this?
{
  "type" : "loadByPeriod",
  "period" : "P3M",
  "includeFuture" : true,
  "tieredReplicants": {
      "hot": 2
  }
}
{
  "type" : "loadByPeriod",
  "period" : "P1Y",
  "includeFuture" : true,
  "tieredReplicants": {
      "cold": 2
  }
}

Mark Herrera

May 12, 2022, 11:37:05 AM
to Druid User
Hi,

So you've split your data nodes into Hot and Cold tiers, with tier names something like "_hot_tier" and "_cold_tier", set through druid.server.tier in each Historical's runtime.properties? In that case your spec is essentially correct, and you'd only have to add:

{
  "type" : "loadByPeriod",
  "period" : "P3M",
  "includeFuture" : true,
  "tieredReplicants": {
      "hot": 2,
      "_hot_tier": 2
  }
}
{
  "type" : "loadByPeriod",
  "period" : "P1Y",
  "includeFuture" : true,
  "tieredReplicants": {
      "cold": 2,
      "_cold_tier":2
  }
}
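
On the question of whether the last 3 months would also end up in the cold tier: the Coordinator walks a datasource's rules from top to bottom and applies the first one that matches a segment, so the order of the two rules is what keeps recent segments out of the cold tier. Here is a rough Python sketch of that behaviour. It is a simplified model and not Druid code: the helper names are made up, periods are reduced to whole months, the match is modeled as an overlap between the segment interval and the period window, and the tier names are the "hot" and "cold" from the original question.

```python
from datetime import datetime

def minus_months(ts, months):
    """Shift ts back by whole months (day clamped to 28 to keep the sketch simple)."""
    total = ts.year * 12 + (ts.month - 1) - months
    year, month = divmod(total, 12)
    return ts.replace(year=year, month=month + 1, day=min(ts.day, 28))

def matches(rule, seg_start, seg_end, now):
    """Assumed loadByPeriod check: segment overlaps [now - period, now or open-ended)."""
    window_start = minus_months(now, rule["months"])
    window_end = datetime.max if rule["includeFuture"] else now
    return seg_start < window_end and seg_end > window_start

def first_matching_tiers(rules, seg_start, seg_end, now):
    """Walk the rules in order; the first match decides where the segment loads."""
    for rule in rules:
        if matches(rule, seg_start, seg_end, now):
            return rule["tieredReplicants"]
    return None  # neither rule matched

RULES = [
    {"months": 3, "includeFuture": True, "tieredReplicants": {"hot": 2}},    # P3M
    {"months": 12, "includeFuture": True, "tieredReplicants": {"cold": 2}},  # P1Y
]

now = datetime(2022, 5, 12)
recent = first_matching_tiers(RULES, datetime(2022, 4, 1), datetime(2022, 4, 2), now)
older = first_matching_tiers(RULES, datetime(2021, 9, 1), datetime(2021, 9, 2), now)
ancient = first_matching_tiers(RULES, datetime(2020, 1, 1), datetime(2020, 1, 2), now)
print(recent, older, ancient)
```

In this model the April 2022 segment stops at the P3M rule and never reaches the P1Y rule, the September 2021 segment falls through to P1Y, and the January 2020 segment matches neither. In a real cluster, segments that match neither rule fall through to whatever rules come next, including the cluster default rules, so a trailing dropForever rule is commonly added if data older than a year should not stay loaded.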

Then make sure to update druid.broker.segment.watchedTiers in the broker's runtime.properties. Something like:

druid.broker.segment.watchedTiers=["_hot_tier", "_cold_tier"]

Finally, you can configure the router to ensure that queries are being sent to the right broker for a given tier. Something like:

Broker 1: `druid.service=druid/broker-hot`
Broker 2: `druid.service=druid/broker-cold`

druid.router.tierToBrokerMap={"_hot_tier": "druid/broker-hot", "_cold_tier": "druid/broker-cold"}

Let us know how it goes.

Diganta Mukherjee

May 13, 2022, 4:23:12 AM
to Druid User
Hi Mark,
Appreciate your quick response and confirmation of the specs. Will try this out.
