Druid taking more storage on disk than it shows.


Dejan Zegarac

Jul 2, 2024, 5:00:05 AM
to Druid User
I'm having an issue where my Druid datasource shows that it's using 20 GB, but on disk it's using 100+ GB.

I use a shared network partition (NAS) for deep storage.

CleanShot 2024-07-02 at 10.56.27.png

CleanShot 2024-07-02 at 10.57.14.png

This datasource is also configured to drop all data older than a month, but in fact I still have data from the beginning of this year, although the older data seems to take up less space than the newer data.

CleanShot 2024-07-02 at 10.58.36.png

There are even segments from May...

I ran a kill task for unused segments before writing this.

Any insights?

John Kowtko

Jul 4, 2024, 3:02:56 PM
to Druid User
Is this only deep storage space you are talking about?  Or does this include the segments when loaded onto the Historicals?

Maybe check the "druid_segments" table in the metadata DB to see what segments should still exist in the system, and see if there are more actual segment files in deep storage than what is listed in druid_segments ...
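
As a rough sketch of that comparison, assuming local (NAS) deep storage where each row's `payload` JSON in `druid_segments` carries a `loadSpec` of type `local` with a `path` key: the helper names `expected_paths` and `find_orphans` are made up for illustration, and you would feed in the payload strings from your own query against the metadata DB.

```python
import json
import os


def expected_paths(payloads):
    """Collect deep-storage file paths from druid_segments payload JSON strings."""
    paths = set()
    for raw in payloads:
        load_spec = json.loads(raw).get("loadSpec", {})
        # Assumption: local deep storage, where loadSpec carries a "path" key.
        if load_spec.get("type") == "local" and "path" in load_spec:
            paths.add(os.path.realpath(load_spec["path"]))
    return paths


def find_orphans(deep_storage_root, known_paths):
    """Return sorted (path, size_bytes) for files no metadata row references."""
    orphans = []
    for dirpath, _dirnames, filenames in os.walk(deep_storage_root):
        for name in filenames:
            full = os.path.realpath(os.path.join(dirpath, name))
            if full not in known_paths:
                orphans.append((full, os.path.getsize(full)))
    return sorted(orphans)
```

Summing the sizes returned by `find_orphans` gives an estimate of how much of the on-disk total is not accounted for by the metadata.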

Let us know if you find any discrepancies.

If the information matches up, then the question is, are the Kill tasks actually removing the segment files and metadata records?  You should be able to tell this by looking at the logs of the kill tasks.
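
Before digging through the logs, it may help to see whether rows with `used = false` are still sitting in `druid_segments`, since a kill task is supposed to remove those rows along with the files for the interval it covers. Here is a minimal sketch, assuming you have already fetched `(start, used, payload)` tuples from that table and that each payload JSON has a `size` field in bytes; `summarize_by_month` is a hypothetical helper name.

```python
import json
from collections import defaultdict


def summarize_by_month(rows):
    """Sum segment sizes per (YYYY-MM, used) from druid_segments rows.

    rows: iterable of (start, used, payload) where start is an ISO-8601
    string and payload is the segment's JSON string.
    """
    totals = defaultdict(int)
    for start, used, payload in rows:
        month = start[:7]  # "2024-05-01T00:00:00.000Z" -> "2024-05"
        # Missing "size" is treated as 0 rather than guessed.
        size = json.loads(payload).get("size", 0)
        # Some databases return the flag as 0/1, so normalize to bool.
        totals[(month, bool(used))] += size
    return dict(totals)
```

Months that still show bytes under `used = False` after a kill would suggest the kill interval did not cover them or the task did not complete.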

Thanks.  John
