Michael Nitschinger told me recently that he ran a comparison and did not find a significant difference. I don't know whether that was with this benchmark or his own, though I suspect it was his own.
Note that the data is from 2010 and covers release 2.5 of spymemcached. A lot has changed since then in memcached itself, the clients and the servers. I know for a fact that some of the things added would help performance, and some could harm performance but are there for correctness.
The other thing that is missing here is any sense of resource utilization, which is important to the performance of the overall system. I'm not saying spymemcached is better here (I just don't know), but I do know there had been some experiments which would burn CPU to get higher throughput.
There is one other reality, fortunate or unfortunate: spymemcached tries to use only one TCP connection to each server. Theoretically speaking, this should allow for the most efficient use of resources between the two. In practice, various things have buffers per TCP connection[1], and thus you can, depending on OS and TCP tunables and such, actually achieve higher throughput by opening more than one connection.
We've definitely observed this with spymemcached. Running with many client objects in artificial workloads can give you more throughput than coordinating on one object.
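To make the "many client objects" idea concrete, here is a minimal sketch of a round-robin pool. It is an illustration under assumptions, not something spymemcached ships: RoundRobinPool, pick() and the factory argument are hypothetical names, and the class is generic so it carries no library dependency.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical helper: builds N clients up front and hands them out in round-robin order. */
final class RoundRobinPool<T> {
    private final List<T> clients;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinPool(int size, Supplier<T> factory) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be >= 1");
        }
        List<T> built = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            built.add(factory.get()); // each client would own its own connection(s)
        }
        this.clients = Collections.unmodifiableList(built);
    }

    /** Returns the next client; floorMod keeps the index valid after the counter overflows. */
    T pick() {
        int i = Math.floorMod(next.getAndIncrement(), clients.size());
        return clients.get(i);
    }

    int size() {
        return clients.size();
    }
}
```

In a real application T would be the memcached client type and the factory would construct each instance. Whether this helps at all depends on the OS and TCP settings mentioned above, so measure before and after.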
All of that said, we don't normally recommend starting there. Why? It's more complicated to write and maintain, and for most real applications the latency and throughput of spymemcached aren't the tallest pole in the tent.
By the way, one of the largest causes of CPU utilization in many cases seems to be serialization. You may be able to cut quite a bit of time and space by changing the format of the data you're storing, or by using the Transcoder interface with third-party serialization. Of course, that all means doing more coding, and that may take time away from lower-hanging fruit for optimization.
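As a rough illustration of the space side of that, the sketch below compares Java's built-in serialization with a hand-written fixed-width encoding for a small value. It assumes your values currently go through Java serialization. Point3 and EncodingDemo are made-up names for this example, and the code uses only the JDK; in a real setup the encode/decode pair would sit behind a Transcoder implementation, which is not shown here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

/** Hypothetical value type standing in for whatever you cache. */
final class Point3 implements Serializable {
    private static final long serialVersionUID = 1L;
    final int x;
    final int y;
    final int z;

    Point3(int x, int y, int z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }
}

final class EncodingDemo {
    /** Java's built-in serialization: stream header and class descriptor are written with the fields. */
    static byte[] javaSerialize(Point3 p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Hand-written format: three big-endian ints, always 12 bytes. */
    static byte[] compactEncode(Point3 p) {
        return ByteBuffer.allocate(12).putInt(p.x).putInt(p.y).putInt(p.z).array();
    }

    /** Inverse of compactEncode; reads the ints back in the same order. */
    static Point3 compactDecode(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        return new Point3(buf.getInt(), buf.getInt(), buf.getInt());
    }
}
```

The trade-off is the one already mentioned: the compact format is smaller and cheaper to produce, but you now own the format and have to version it yourself when the class changes.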
Of course, we're definitely looking into opportunities to enhance performance too and will continue to. Any suggestions are welcome!
Matt