to rocksdb
There was a question about not using the OS page cache with RocksDB. It was filed as a GitHub issue, and I prefer to limit GitHub to bugs and feature requests.
The question from https://github.com/facebook/rocksdb/issues/1031 was:
-----
i am using MongoDB with RocksDB as the storage engine. i want to compare the performance with or without block cache. For both case i disabled OS buffer and there's nocompression. when enable block cache, the size is 4GB. the workload is YCSB's workloada,load data is 24,000,000 1KB records. operation count is 48,000,000. distribution is uniform. i suppose it would be not much different between enable block cache and disable it, since cache size is much smaller than total loaded data, and the distribution is uniform, there's no locality. However, with block cache, the performance is much better (2.5X) than without block cache. By doing a little profiling, i found the benefits is coming from index locate path inside MongoDB. In MongoDB, RocksDB engine stores the index as a key-value, key is the indexed fields and value is the document id. For with block cache case, it seems this key-value which stores the index information is in memory. Considering the index data is a normal key-value as the actual document in RocksDB, how could RocksDB ping the index key-values in memory?
-----
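To make the numbers in the question concrete, here is a rough back-of-the-envelope sketch of how much of the loaded document data a 4GB block cache could hold. It assumes 1KB means 1024 bytes and 4GB means 4 GiB, and it ignores keys, block and index overhead, and the MongoDB index entries, so treat the result as an approximation only. The function name is made up for illustration.

```python
def cache_fraction(records, record_bytes, cache_bytes):
    """Fraction of the raw record data that fits in the cache.

    Under a uniform access distribution, and if the cache held nothing
    but record data, this is roughly an upper bound on the hit rate for
    document reads.
    """
    data_bytes = records * record_bytes
    return cache_bytes / data_bytes


# Numbers from the question (unit interpretation is an assumption).
RECORDS = 24_000_000
RECORD_BYTES = 1024          # "1KB records", assumed to be 1024 bytes
CACHE_BYTES = 4 * 1024**3    # "4GB" block cache, assumed to be 4 GiB

fraction = cache_fraction(RECORDS, RECORD_BYTES, CACHE_BYTES)
print(f"cache covers about {fraction:.1%} of the document data")
```

Under these assumptions the cache covers somewhat less than a fifth of the document data, which is consistent with the expectation in the question that caching the documents alone would not help much with a uniform distribution.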
I wonder about the performance hit from not using the OS page cache with RocksDB because:
1) By default, compressed blocks are in the OS page cache and uncompressed blocks are in the RocksDB block cache. For the question above, compression was not used, but I expect most users to benefit from compression. RocksDB has an option to also cache compressed blocks, but it is neither widely tested nor widely used, and I think it needs some work. For example, I'd rather not statically partition the RAM in RocksDB between compressed and uncompressed blocks.
2) AFAIK, output (new files) from RocksDB compaction isn't put into the RocksDB block cache. With buffered IO, it is put into the OS page cache by default. Files in levels 0, 1, and 2 are frequently created and replaced. If the OS page cache isn't used, then the first access to these files after compaction means we read more data from storage. This wastes IOPS.
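Point 2 can be illustrated with a toy model. This is not RocksDB code: the function and its parameters are hypothetical, blocks are just integers, and the model assumes no eviction. It only counts how many storage reads the first accesses to freshly compacted files cost, depending on whether compaction output was written through the OS page cache.

```python
def first_access_storage_reads(new_file_blocks, accessed_blocks, buffered_io):
    """Count storage reads for accesses to blocks of newly compacted files.

    Toy assumptions: with buffered IO, every block written by compaction
    is already in a cache (the OS page cache); without it, a block is
    cached only after it has been read from storage once. Nothing is
    ever evicted.
    """
    cached = set(new_file_blocks) if buffered_io else set()
    storage_reads = 0
    for block in accessed_blocks:
        if block not in cached:
            storage_reads += 1   # cache miss: go to storage
            cached.add(block)    # cached from now on
    return storage_reads


new_blocks = range(100)      # blocks written by a compaction
accesses = [0, 1, 1, 2]      # later reads that touch those blocks

with_page_cache = first_access_storage_reads(new_blocks, accesses, buffered_io=True)
without_page_cache = first_access_storage_reads(new_blocks, accesses, buffered_io=False)
print(with_page_cache, without_page_cache)
```

In this model every distinct block touched after compaction costs one extra storage read when the page cache is bypassed, which is the IOPS cost described above.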