Forgive me if this is the wrong place to ask database design questions.
I have a ClickHouse database that ingests DNS requests, and I would like to keep a materialised view of the N most frequently queried domains per time period (e.g. the top 20 domains per 5 minutes, kept for a year) for graphing.
The problem is that we ingest an enormous number of DNS requests, so I have to TTL DELETE the raw data after 2 days. For the same reason I can't use a SummingMergeTree over every domain queried, as that aggregate would still be very large, and I don't care about 99.9999% of the records in it.
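For context, the raw table is shaped roughly like this (dns_requests and the column names are placeholders, not my real schema):

    CREATE TABLE dns_requests
    (
        ts     DateTime,
        domain String
    )
    ENGINE = MergeTree
    ORDER BY (ts, domain)
    TTL ts + INTERVAL 2 DAY DELETE;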
I guess I could build a separate script that queries the main table every 5 minutes, does a GROUP BY / sum() / LIMIT BY, and inserts the results into a regular table, something like the sketch below.
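Roughly this, run from cron every 5 minutes (top_domains_5m is a placeholder name, and count() stands in for whatever aggregate I'd actually sum):

    CREATE TABLE top_domains_5m
    (
        bucket DateTime,
        domain String,
        hits   UInt64
    )
    ENGINE = MergeTree
    ORDER BY (bucket, domain);

    -- aggregate the previous complete 5-minute window, keeping only the top 20
    INSERT INTO top_domains_5m
    SELECT
        toStartOfInterval(ts, INTERVAL 5 MINUTE) AS bucket,
        domain,
        count() AS hits
    FROM dns_requests
    WHERE ts >= toStartOfInterval(now(), INTERVAL 5 MINUTE) - INTERVAL 5 MINUTE
      AND ts <  toStartOfInterval(now(), INTERVAL 5 MINUTE)
    GROUP BY bucket, domain
    ORDER BY hits DESC
    LIMIT 20 BY bucket;

(Since each run only covers a single window, a plain LIMIT 20 would do; LIMIT 20 BY bucket just also works if I backfill several windows in one query.)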
But this feels like a common use case (tracking the top-N most frequent subset over time), so I feel like I'm missing something obvious.
Thank you in advance,