First, do not underestimate the impact of a virtualized environment on
network throughput. It is significant, especially if your hypervisor
or the Linux kernel is not correctly tuned.
A few years ago, I routinely measured a 50% overhead with redis-benchmark
on ESX VMs (i.e. half the throughput on a VM, compared to an equivalent
physical box).
That said, Redis is clearly not optimized to serve large objects.
On the one hand, the execution of all commands has to be atomic.
On the other hand, transferring 1 MB of data over the wire takes time.
So Redis has to buffer the output of the command.
If the command is a GET on a 1 MB object, it means the 1 MB object is first copied
into the output buffer, and then progressively copied into the socket buffer
while the client is ingesting the reply. This extra copy is needed because the
object can be updated just after the GET command by another session. It is
not enough to refer to the object from the output buffer: Redis has to copy it.
I don't know how Cassandra works, but perhaps it is able to avoid this copy by
referring to immutable objects.
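
To illustrate why the copy matters, here is a minimal Python sketch. It is only a toy
model of the behaviour described above, not Redis's actual implementation: the
ToyServer class and its method names are hypothetical, and values are stored as
mutable bytearrays that get overwritten in place.

```python
class ToyServer:
    """Toy model of a server buffering replies; not Redis's real code."""

    def __init__(self):
        self.store = {}    # key -> bytearray, overwritten in place on update
        self.pending = {}  # client id -> reply waiting to be sent

    def set(self, key, value):
        if key in self.store:
            # Overwrite the existing value in place (assumed for this sketch).
            self.store[key][:] = value
        else:
            self.store[key] = bytearray(value)

    def get_by_copy(self, client, key):
        # Snapshot the value: the pending reply is independent of the store.
        self.pending[client] = bytes(self.store[key])

    def get_by_reference(self, client, key):
        # No copy: the pending reply aliases the stored (mutable) value.
        self.pending[client] = self.store[key]


server = ToyServer()
server.set("k", b"a" * 8)
server.get_by_copy("c1", "k")
server.get_by_reference("c2", "k")

# Another session updates the key before either reply has been sent.
server.set("k", b"b" * 8)

copied_reply = bytes(server.pending["c1"])      # still the old value
referenced_reply = bytes(server.pending["c2"])  # sees the new value
```

The client served by copy still gets the value as it was when its GET ran, while
the client served by reference gets whatever was written afterwards. With immutable
values, the reference would be safe and the copy could be skipped.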
Regards,
Didier.