Hello everyone,
Looking for some clues to help debug an interesting performance issue with RocksDB. Thanks in advance!
Basically, we have an application that has two ways to ingest data:
Method 1 does DB.get(key), checks the existing value, and then conditionally does DB.put(key, newValue).
Method 2 builds SST files offline and ingests them via the ingestExternalFile() method, then ingests any remaining K/V pairs with the same DB.get()/check/DB.put() flow.
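For illustration, the get/check/put flow shared by both methods can be sketched as below. This is a simplified sketch, not the actual application code: KvStore and InMemoryStore are hypothetical stand-ins for the RocksDB handle so the example runs on its own, and the check (write only when the key is absent or the stored version is older) is an assumed placeholder for the real condition.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class ConditionalPutSketch {

    /** Hypothetical stand-in for the two DB calls used in the flow. */
    interface KvStore {
        byte[] get(byte[] key);
        void put(byte[] key, byte[] value);
    }

    /** In-memory store (assumption) so the sketch runs without RocksDB. */
    static final class InMemoryStore implements KvStore {
        private final Map<String, byte[]> data = new HashMap<>();

        // ISO_8859_1 maps every byte to exactly one char, so keys round-trip safely.
        private static String asMapKey(byte[] key) {
            return new String(key, StandardCharsets.ISO_8859_1);
        }

        @Override
        public byte[] get(byte[] key) {
            return data.get(asMapKey(key));
        }

        @Override
        public void put(byte[] key, byte[] value) {
            data.put(asMapKey(key), value.clone());
        }
    }

    /**
     * get -> check -> conditional put.
     * The "check" is a placeholder: values are 8-byte versions, and a write
     * happens only if the key is absent or the stored version is older.
     */
    static boolean putIfNewer(KvStore db, byte[] key, long newVersion) {
        byte[] current = db.get(key);                        // one point lookup per key
        if (current != null
                && ByteBuffer.wrap(current).getLong() >= newVersion) {
            return false;                                    // check failed, skip the put
        }
        db.put(key, ByteBuffer.allocate(Long.BYTES).putLong(newVersion).array());
        return true;
    }

    public static void main(String[] args) {
        KvStore db = new InMemoryStore();
        byte[] key = "user:1".getBytes(StandardCharsets.UTF_8);
        System.out.println(putIfNewer(db, key, 5L)); // key absent, value is written
        System.out.println(putIfNewer(db, key, 3L)); // stored version is newer, put is skipped
        System.out.println(putIfNewer(db, key, 7L)); // stored version is older, value is overwritten
    }
}
```

The point of the sketch is that every processed key pays for one point lookup before any write, so the per-get latency dominates this part of the ingest.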
We found Method 1 was not fast enough to bootstrap the database when the application restarts, so we added Method 2. However, Method 2 is not always faster than Method 1; in particular, DB.get() is much slower when handling the remaining K/V pairs.
The ingestExternalFile() call finishes very quickly, ingesting the majority of the K/V pairs in a few seconds vs. a few minutes with Method 1. But processing the remaining K/V pairs takes very long, sometimes making the total duration even longer than with Method 1. The remaining K/V pairs might be 10% or less of all the K/V pairs.
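Looking into the RocksDB logs, we saw that `db.get.micros` from Method 2 was about 10X higher than from Method 1, roughly 300us vs. 30us. At that ratio, the gets for the remaining ~10% of keys cost roughly as much as the gets for all keys in Method 1.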
With async-profiler, we could see that the two methods had very similar CPU profiles (two example profiles, one per method, are attached).
The two methods used the same DB/ColumnFamily configs, so the SST files built for Method 2 used the same configs too: BlockBasedTable with LZ4 compression and the HashIndex index type.
So I wonder whether ingestExternalFile() has any known performance impact on subsequent DB.get() calls. Perhaps the block cache is not warmed when SST files are ingested directly? Are there any other metrics I can check to debug this further? If it is a cold cache, is there an efficient way to warm it after ingesting SST files?
Any clues to help debug this further are welcome. Thanks!!
best,
Kane