A question about read-modify-write performance testing.


Qian Wang

Dec 7, 2022, 4:03:42 AM
to rocksdb
In https://github.com/facebook/rocksdb/wiki/Read-Modify-Write-Benchmarks, I noticed that RocksDB conducted comparative experiments between Merge and read-modify-write operations. But I have doubts about the implementation details in db_bench. The steps db_bench takes in UpdateRandom are:
1. Read a random key
2. Write a new value to that key
But its semantics are essentially different from MergeRandom's, because the new value written by UpdateRandom has no relationship to the value read, whereas with Merge the value written is derived from the value read (e.g., the value read + 1).
This problem may not be apparent in examples using counters, but consider the StringAppendOperator. With this MergeOperator, the semantics of MergeRandom make the size of the value grow continuously, whereas UpdateRandom never appends to the value, so the value's size stays unchanged.
Should we consider modifying the UpdateRandom function so that, after reading the key, it performs some operation on the corresponding value to generate a new value and writes that, instead of writing a random new value as it does now? Or should we add an RMWRandom function to realize the semantics described above?
Thank you!

Mark Callaghan

Dec 7, 2022, 6:09:22 PM
to rocksdb
That is an interesting question.

If you are using a skewed key distribution then the same key should probably be used for the write that follows the read. I think this is less of an issue when the key access distribution is uniform. 

To be most accurate, the update (read + write) should be done in a transaction, especially if there is useful data in the value, like a counter.
 
I haven't been concerned about this because I don't use that benchmark much: I assume the value is just random junk, and I usually use a uniform key access distribution. But I think it could be fixed as you suggest.

Qian Wang

Dec 8, 2022, 1:06:53 AM
to rocksdb
Thank you!
It seems this question is about the semantics of read-modify-write and has nothing to do with the key distribution. Maybe I can raise an issue about this.

Mark Callaghan

Dec 8, 2022, 10:59:04 AM
to rocksdb
I misunderstood your question, and the code in question (https://github.com/facebook/rocksdb/blob/main/tools/db_bench_tool.cc#L7377). It uses the same key for each read/write pair. As you wrote above, it just overwrites the previous value. But most of the tests that do writes also do just that -- clobber the existing value. So if you open an issue, please be specific about the behavior you want it to have on writes, and then explain whether you want that supported for all of the benchmarks that do writes. That might be a large diff to implement.

Qian Wang

Dec 9, 2022, 4:15:36 AM
to rocksdb
Yes! I've opened a related issue with a more detailed explanation of the problem and my desired improvement; it can be seen here: https://github.com/facebook/rocksdb/issues/11025

Qian Wang

Dec 12, 2022, 10:09:21 AM
to rocksdb
Hello! I don't know why no one has responded to my issue.
The behavior I would like on writes is this: new_value is derived from old_value (for example, old_value + 1, appending to old_value, or another operation the user can specify), not randomly generated.
It is not required that all write benchmarks support this behavior, only that UpdateRandom add it. Or maybe we can define a new benchmark.
My original intention was to implement a benchmark with the same semantics as MergeRandom. Clearly, the existing behavior of randomly generating a new value differs from MergeRandom's semantics, so what is the meaning of the comparison between MergeRandom and UpdateRandom?