Any performance benchmarks centered around key length?

880 views
Skip to first unread message

GregA

unread,
Mar 10, 2011, 12:53:20 AM3/10/11
to Redis DB
Hello, I'm debating with one of my company's developers about whether
the length of the keys in our GET and SET commands will impact the
performance of the Redis server. The lengths we're talking about are
30-80 characters, but I'm interested in information about longer keys
as well.

I found the /sys/toilet blog from August where the blogger did some
benchmarking of the effects of key length up to 200 characters, but
I'd like to see if anyone else has test results they can share.

Thanks!

Josiah Carlson

unread,
Mar 10, 2011, 1:55:38 AM3/10/11
to redi...@googlegroups.com, GregA
I don't have any results either way, but...

Considerations for whether 30/80 byte keys makes a difference:
1. whether your data size would push the packet beyond the ethernet
frame size and/or MTU
2. whether the key being larger than a 64 byte cache line makes a
difference in hashing
3. whether the key being larger than a 64 byte cache line makes a
difference in comparison of keys after hashing
4. other IO, interrupts, or network weather at the time

My gut says that network weather and interrupt handling delays would
tend to dominate individual set/get performance more than anything
else in a simple "web request comes in, send the client the response
from a small number of Redis GET calls", though streaming a large
volume of requests via pipeline may offer a slight performance
advantage to the short keys for otherwise unloaded networks.

Regards,
- Josiah

> --
> You received this message because you are subscribed to the Google Groups "Redis DB" group.
> To post to this group, send email to redi...@googlegroups.com.
> To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.
>
>

Jak Sprats

unread,
Mar 11, 2011, 2:14:49 PM3/11/11
to Redis DB

I tested key length's impact on performance about 6 months ago and
basically everything Josiah wrote is correct.

request or responses gradually get slower as the number of bytes in
the request or response grow and there are visible latency jumps as
the request and response sizes cross MTU boundary sizes.

The deradation in speed w/ keys from sizes of 1-1500 bytes is not all
that significant (cant remember exactly, maybe 1500 is 2X slower than
2 bytes), but there is a big difference between 1450 and 1550 as it
crosses the MTU size, the request becomes two packets which may have
to be reassembled, etc...

30-80 byte keys or even 200 byte keys should be basically as fast as
10 byte keys to the best of my knowledge.

The cool thing during my tests was that the overall speed was
basically dictated by the number of packets being transmitted, all of
the other factors (hash-table computations, memory lookups, etc...
played no significant role). Again these tests were only for SET/GET
so there is not a lot of computation going on.

On Mar 9, 11:55 pm, Josiah Carlson <josiah.carl...@gmail.com> wrote:
> I don't have any results either way, but...
>
> Considerations for whether 30/80 byte keys makes a difference:
> 1. whether your data size would push the packet beyond the ethernet
> frame size and/or MTU
> 2. whether the key being larger than a 64 byte cache line makes a
> difference in hashing
> 3. whether the key being larger than a 64 byte cache line makes a
> difference in comparison of keys after hashing
> 4. other IO, interrupts, or network weather at the time
>
> My gut says that network weather and interrupt handling delays would
> tend to dominate individual set/get performance more than anything
> else in a simple "web request comes in, send the client the response
> from a small number of Redis GET calls", though streaming a large
> volume of requests via pipeline may offer a slight performance
> advantage to the short keys for otherwise unloaded networks.
>
> Regards,
>  - Josiah
>

Don Marti

unread,
Mar 11, 2011, 3:15:10 PM3/11/11
to redi...@googlegroups.com
begin Jak Sprats quotation of Fri, Mar 11, 2011 at 11:14:49AM -0800:

> The cool thing during my tests was that the overall speed was
> basically dictated by the number of packets being transmitted, all of
> the other factors (hash-table computations, memory lookups, etc...
> played no significant role). Again these tests were only for SET/GET
> so there is not a lot of computation going on.

Could you see any differences in timing by length
of partial matches? I'm just wondering about timing
attacks (so never using session cookies as keys just
to be on the safe side).

--
Don Marti
http://zgp.org/~dmarti/
dma...@zgp.org

Jak Sprats

unread,
Mar 12, 2011, 12:01:49 PM3/12/11
to Redis DB
Hi Don,

I dont 100% understand what you mean by partial matches, because in
redis partial matches are misses. Misses would depend solely on key-
length as the response would be around 5 bytes.
W/ session cookies (max size 4KB), set a max key-length of 1500, you
should be ok, no matter what anyone does.

- Jak

p.s. actually not 1500, its something more like 1468 w/ the tcp/ip
overhead factored in

Santiago Perez

unread,
Mar 14, 2011, 9:36:10 PM3/14/11
to redi...@googlegroups.com, Jak Sprats
I think what Don is worried about are measurable differences in the execution time of misses with common prefixes with existing keys. Those could be used by attackers to build existing keys and potentially accessing private data. 

I don't know the code very well, but since redis uses a hashtable, the attacker would need to besides matching a prefix of the existing key be lucky enough for both keys (the real one and the attack one) to resolve to the same hash. Then, a comparison of both keys would take place and there could be a chance of a measurable difference. The chances of finding common prefixes while resolving to the same hash are low and make the attack at least N times harder (where N is the number of buckets in the hashtable).

Josiah Carlson

unread,
Mar 15, 2011, 2:24:18 AM3/15/11
to redi...@googlegroups.com, Santiago Perez, Jak Sprats
The hash algorithm is very simple. Finding collisions for any given
hash size would be easy given a moderately intelligent hacker.

But re-read the earlier posts where we discuss what would make the
most difference. Hint: it's not really key size.

- Josiah

Don Marti

unread,
Mar 15, 2011, 2:23:21 PM3/15/11
to redi...@googlegroups.com
begin Josiah Carlson quotation of Mon, Mar 14, 2011 at 11:24:18PM -0700:

>
> The hash algorithm is very simple. Finding collisions for any given
> hash size would be easy given a moderately intelligent hacker.
>
> But re-read the earlier posts where we discuss what would make the
> most difference. Hint: it's not really key size.

What I'm doing, and this may be overkill, is taking
HMAC(secret key, user-visible session cookie) and
using that as the Redis key. That way -- I think,
please correct me if I'm wrong -- it should make it
harder to do a timing attack on session cookies,
even if Redis takes longer to return a "no match"
on a near match, and someone who is trying to guess
session cookies has a copy of Redis to play with.

Josiah Carlson

unread,
Mar 15, 2011, 2:37:03 PM3/15/11
to redi...@googlegroups.com, Don Marti
If you are using any remotely cryptographically secure hash as a key
with Redis (you've got a secret), the likelihood of them being able to
get two keys to get close to colliding so that more than the first few
characters of the key match are very unlikely, never mind being able
to get the bits lined up so that hashes also collide in Redis. I
wouldn't worry, even if network latency was zero. With nonzero network
latency, I'd probably have a glass of scotch, a cigar, and watch an
old movie.

- Josiah

Reply all
Reply to author
Forward
0 new messages