Understanding RocksDB memory usage on opening databases

Manuel Holtgrewe

Sep 6, 2023, 2:54:26 AM
to rocksdb
Hello,

I'm using RocksDB for large lookup tables that are written once and then only opened in read-only mode. I'm seeing high memory usage for some of these databases, and I would like to understand what memory is allocated when opening a database and what options exist for reducing it.

A typical lookup table holds ~400 GB of data in SST files of ~1 GB each, so ~400 SST files. I made sure to force a compaction when the database was originally written, and the (then empty) WAL file is removed before opening it read-only.

It is my understanding that when opening a database, the RocksDB library considers each SST file and creates some in-memory data structures for each. Could anyone point me to places where I can learn more about tuning this memory usage? Ideally, I would like to find out which options I can set when opening the database to reduce memory usage. If necessary, I could re-create the SST files with different settings (is there anything in an SST file that can make the library use more or less memory when opening it?).

My "read only" workload consists more or less of random access to a small fraction of the values so it would not benefit from any caching.

Cheers,
Manuel

Dan Carfas

Sep 6, 2023, 4:07:06 AM
to Manuel Holtgrewe, rocksdb
From the Speedb Hive:

When you open a database, RocksDB normally loads the index and filter blocks of every SST file into memory, which probably explains the huge memory consumption. Putting the index and filter blocks into the block cache (cache_index_and_filter_blocks=true) will reduce memory, but it will probably cause a performance penalty. In our new open-source release 2.6 we have introduced a new memory manager that pins filters and indexes in memory up to a user-defined limit. Feel free to join our community and download 2.6, which is fully compatible with RocksDB 8.1. If you need us to work with you and further optimize for your specific use case, please contact me directly.
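For reference, a minimal sketch of enabling this with the plain RocksDB C++ API (the cache capacity and the database path are placeholders, not recommendations):

#include <rocksdb/cache.h>
#include <rocksdb/db.h>
#include <rocksdb/table.h>

int main() {
  rocksdb::BlockBasedTableOptions table_options;
  // Serve index and filter blocks through the block cache instead of
  // keeping all of them resident on the heap; their memory is then
  // bounded by the cache capacity, at the cost of possible cache misses.
  table_options.cache_index_and_filter_blocks = true;
  // Cap the block cache at 1 GiB (placeholder value).
  table_options.block_cache = rocksdb::NewLRUCache(1 << 30);

  rocksdb::Options options;
  options.table_factory.reset(
      rocksdb::NewBlockBasedTableFactory(table_options));

  rocksdb::DB* db = nullptr;
  rocksdb::Status status =
      rocksdb::DB::OpenForReadOnly(options, "/path/to/db", &db);
  if (!status.ok()) return 1;
  // ... random point lookups via db->Get(...) ...
  delete db;
  return 0;
}

With this, total index/filter memory is bounded by the cache size rather than growing with the number of SST files.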

You can find the Speedb Hive here and (once you've registered) the link to the thread with your question here, if you have more questions or need additional info.

Manuel Holtgrewe

Sep 12, 2023, 10:13:43 AM
to Dan Carfas, rocksdb
Dear Dan,

thank you for your reply. I also found this information here about partitioned indices, which is supposed to help, as well as some mention of Ribbon filters rather than Bloom filters. I guess I will have to try a couple of variations for my use case.

It is a bit hard to see which settings have which influence from the documentation alone.

Is my impression correct that, in the end, the main source of documentation is skimming through the RocksDB sources to figure out the details?

Cheers,
Manuel

Dan Carfas

Sep 12, 2023, 3:37:47 PM
to Manuel Holtgrewe, rocksdb
Hi Manuel,

You're right, it can be difficult to learn about the complex relationship between the different RocksDB settings from the documentation alone. We'd be happy to help you optimize your use case. You can join our community and set up a meeting with our engineering team to discuss your specific needs.

Thanks,
Dan

Manuel Holtgrewe

Sep 15, 2023, 6:15:50 AM
to Dan Carfas, rocksdb
Dear Dan,

thank you again for your reply.

After rebuilding my RocksDB tables with partitioned indexes/filters, the memory requirements have gone down drastically. Performance is not so much of an issue here; I'm aiming primarily at small (compressed) data at rest plus acceptably low memory requirements. So for anyone finding this in the future: if your requirements are the same as mine, rebuild your database with partitioned indexes/filters and your memory requirements should go down.
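For completeness, a minimal sketch of the kind of options I mean, using the RocksDB C++ API (the Bloom filter policy and the 10 bits per key are illustrative choices for the rebuild, not a measured recommendation):

#include <rocksdb/db.h>
#include <rocksdb/filter_policy.h>
#include <rocksdb/table.h>

int main() {
  rocksdb::BlockBasedTableOptions table_options;
  // Two-level (partitioned) index: only the small top-level index
  // stays in memory; index partitions are loaded on demand.
  table_options.index_type =
      rocksdb::BlockBasedTableOptions::IndexType::kTwoLevelIndexSearch;
  // Partition the filters the same way.
  table_options.partition_filters = true;
  // Bloom filter with ~10 bits per key (illustrative; Ribbon filters
  // via NewRibbonFilterPolicy(10) trade CPU for smaller filters).
  table_options.filter_policy.reset(rocksdb::NewBloomFilterPolicy(10));
  // Let index/filter partitions go through the block cache.
  table_options.cache_index_and_filter_blocks = true;

  rocksdb::Options options;
  options.create_if_missing = true;
  options.table_factory.reset(
      rocksdb::NewBlockBasedTableFactory(table_options));

  rocksdb::DB* db = nullptr;
  rocksdb::Status status = rocksdb::DB::Open(options, "/path/to/db", &db);
  if (!status.ok()) return 1;
  // ... bulk-load the data ...
  // Force a full compaction, as described above, before shipping.
  db->CompactRange(rocksdb::CompactRangeOptions(), nullptr, nullptr);
  delete db;
  return 0;
}

On the read side, combined with cache_index_and_filter_blocks=true, only the top-level indexes plus whatever partitions are currently cached stay in memory.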
