Hi Rodrigo,
The big question here is: what kind of benchmarks are you looking for? We can create as many synthetic benchmarks as there are ideas out there, but the most informative benchmark you can have is your own environment, or a dataset as close to yours as you can get if you don't have a running system yet. Most benchmarks out there measure things that are never found in practice. We do use certain scenarios to guide our decisions, and therefore have such benchmarks internally, but they are of little use for understanding how a live environment would behave for your use case.
Just to showcase a few:
One of our guys tried a Raspberry Pi with its own internal storage and could get 1K write transactions per minute. For reference, 600 write transactions per second is about 50M write transactions per day; these numbers add up fast.
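The "adds up fast" arithmetic is trivial to check yourself; a quick sketch of the conversion:

```python
# Back-of-the-envelope throughput math from the figures above.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def per_day(tx_per_second: float) -> float:
    """Daily transaction count for a sustained per-second rate."""
    return tx_per_second * SECONDS_PER_DAY

print(per_day(600))        # 600 tx/s  -> 51,840,000/day (the ~50M above)
print(per_day(1000 / 60))  # 1K tx/min -> 1,440,000/day (the Raspberry Pi case)
```

Even the 1K-per-minute Raspberry Pi number is well over a million writes a day, which is more than most sites will ever see.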
If we attach a second Raspberry Pi acting as a controller for an external disk, one board for RDB and the other for the disk, we can get 6K per minute…
For context, if you take the long tail of the internet, probably 70% of websites could actually be served by a 100 USD setup composed of Raspberry Pis. I am not saying it is a good idea; I am saying that you can.
You know the Stack Overflow dataset? Stack Overflow dumps all their content every once in a while and puts it up for download via torrent. We use it for benchmarking internally because it is as real-world as it gets. The dump we use is 59 GB of data. We are able to push all of it, with full-text search, indexing, and also some map-reduce jobs to count things, in under 35 minutes on an x.large AWS instance (8 cores, 16 hyperthreads). That's 59 GB of data in a single import. But here is why I tell you benchmarks paint a picture that is sometimes not indicative of actual performance in your use case: we are not pushing those 59 GB of data through the network. We tried that once, and we saturated the whole network link while we couldn't get past 20% to 25% CPU usage… so for the actual test, I had to set aside a core to do the actual pushing of the data from localhost.
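To put that import in perspective, here is the sustained ingest rate those numbers imply (a rough calculation, using the 59 GB and 35 minute figures above):

```python
# Average ingest rate implied by the Stack Overflow import figures above.
gigabytes = 59
minutes = 35

mb_per_second = gigabytes * 1024 / (minutes * 60)
print(f"{mb_per_second:.1f} MB/s")  # ~28.8 MB/s sustained, indexing included
```

That is an average; the actual pipeline is bursty, which is how a network link can end up saturated long before the CPUs are busy.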
For certain document sizes (small, but not that small) we can achieve 35K write transactions per second on commodity hardware. On my gaming computer at home, a Core i7 7700 with 32 GB of memory, I can get 35K transactions per second while doing both the pushing from hundreds of client threads and running the server at the same time, on the same machine. Remember: if 600 per second is 50M per day, the math here is pretty simple to do.
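The shape of that test is just a many-client load generator hammering one write path. A minimal, self-contained sketch of the idea (the "server" below is a hypothetical in-process stub guarded by a lock, standing in for the database; in the real harness each worker would push documents over the client API instead):

```python
# Sketch of a many-client write-throughput harness under contention.
# The StubServer is an assumption for illustration, not the real client API.
import threading
import time

class StubServer:
    def __init__(self):
        self._lock = threading.Lock()
        self.writes = 0

    def write(self, doc):
        with self._lock:  # shared contention point, like a write lock
            self.writes += 1

def worker(server, stop, doc):
    # Each client thread writes in a tight loop until told to stop.
    while not stop.is_set():
        server.write(doc)

server = StubServer()
stop = threading.Event()
threads = [threading.Thread(target=worker, args=(server, stop, {"n": i}))
           for i in range(8)]  # the real test used hundreds of client threads
for t in threads:
    t.start()
time.sleep(0.5)  # measurement window
stop.set()
for t in threads:
    t.join()
print(f"{server.writes} writes in 0.5s from 8 concurrent clients")
```

Running the clients and the server on the same box is exactly what makes this a contention test rather than a pure throughput test, which is how we stumbled onto that runtime bug.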
I use this to see how we behave under contention scenarios; we actually uncovered a repro for a CoreCLR runtime error doing so: https://github.com/dotnet/coreclr/issues/13388. You won't get numbers more official than the ones I am telling you about here, because those are the numbers we use to guide our optimization efforts, but they are certainly NOT indicative of your use case, your dataset, your hardware, etc. As they say: "There are lies, damned lies, and statistics". Synthetic benchmarks, while good statistics to guide some hypotheses, are damned lies for those seeking to understand runtime behavior. ;)
The best benchmark is your own reality. Some users are comparing RDB 4.0 performance against their own system implemented on RDB 3.x, like Kamran Ayub tells on DotNetRocks at http://ow.ly/usgH30jK57T

Hope that helps.
Federico