How to structure a top N freq materalized view or similar

Skip to first unread message

Daniel Trembath

May 19, 2021, 7:25:13 AMMay 19
to ClickHouse
Forgive me if this is the wrong place to ask database design questions.

I have a Clickhouse database that absorbs DNS requests and I would like to keep a materialised view of the most frequently searched N domains each period of time (eg top 20 domains each 5 minutes for a year) for graphing.

The problem is we ingest an enormous number of DNS requests. I have to TTL DELETE the raw data after 2 days. So I can't using a SummingMergeTable of every domain queried as it would still be very large. I don't care about 99.9999% of the records in that aggregate data.

I guess I could build a separate script that queries the main table every 5 minute, GROUP BY / sum() / LIMIT BY and inserts the data into a regular table.
But this feels like a common use case (track the most top most frequent subset over time), so I feel like I'm missing something obvious.

Thank you in advance,

Reply all
Reply to author
0 new messages