MemcachedKeyLengthError

62 views
Skip to first unread message

Jeff Anderson

unread,
Mar 26, 2010, 11:54:28 AM3/26/10
to smug...@googlegroups.com
Hello,

I've occasionally gotten reports with tracebacks similar to the following:

<snip>

File "/home/httpd/www.programmerq.net/smug/smug/load.py", line 196, in lookup
values = cache.get(self.key)

File "/usr/lib/python2.6/site-packages/django/core/cache/backends/memcached.py", line 25, in get
val = self._cache.get(smart_str(key))

File "/usr/lib/python2.6/site-packages/memcache.py", line 698, in get
check_key(key)

File "/usr/lib/python2.6/site-packages/memcache.py", line 994, in check_key
% SERVER_MAX_KEY_LENGTH)

MemcachedKeyLengthError: Key length is > 250

The URL that was visited when this error occurred was similar to this:
http://www.example.com/blog/2011/dec/13/reallyreallylong/bogus/path/+[PLM=0]+GET+http:/www.example.com/blog/2011/dec/13/reallyreallylong/bogus/path/+[0,10798,10099]+-%3E+[N]+POST+http:/www.example.com/blog/2011/dec/13/reallyreallylong/bogus/path/+[0,0,10099]/

Obviously someone was scanning for a potential vulnerability, and when my blog
app didn't return a match to the URL, it got passed along to smug since smug
catches all last.

The error is that smug uses the path verbatim when determining what memcached
key to use, and there is a maximum that memcached supported. The following
Django ticket illustrates a fix that was done to the caching templatetag in
Django's core: http://code.djangoproject.com/ticket/11270

Basically they made it so that it'll never make a key that long. The method
they chose was to generate an md5 hexdigest of the object. Would a similar fix
work here?


Thanks!

Jeff Anderson

Andrew McNabb

unread,
Mar 26, 2010, 1:31:53 PM3/26/10
to smug...@googlegroups.com
On Fri, Mar 26, 2010 at 09:54:28AM -0600, Jeff Anderson wrote:
>
> The error is that smug uses the path verbatim when determining what memcached
> key to use, and there is a maximum that memcached supported. The following
> Django ticket illustrates a fix that was done to the caching templatetag in
> Django's core: http://code.djangoproject.com/ticket/11270

That's like an interesting problem.


> Basically they made it so that it'll never make a key that long. The method
> they chose was to generate an md5 hexdigest of the object. Would a similar fix
> work here?

I think that fix would work here, too. I would probably use sha1
instead of md5, but that's an implementation detail.

Would you like to create a patch, or would you prefer that I look at it?
Note that if it's me, it will probably be about a week before I have a
chance. The testing will probably require more time than the change
itself.

I think this would involve a one-line change in CachedFile.__init__.
Should this also be done in HeadCache.__init__?

Another question: should we only do this for keys longer than a certain
length, or should we do it for all keys? I'm not sure whether the
hashing will take much CPU time; if it's non-negligible, we could skip
this step for keys less than, say, 30 characters long.

--
Andrew McNabb
http://www.mcnabbs.org/andrew/
PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868

Jeff Anderson

unread,
Apr 3, 2010, 12:36:28 PM4/3/10
to smug...@googlegroups.com
Andrew McNabb wrote:
>> Basically they made it so that it'll never make a key that long. The method
>> they chose was to generate an md5 hexdigest of the object. Would a similar fix
>> work here?
>>
>
> I think that fix would work here, too. I would probably use sha1
> instead of md5, but that's an implementation detail.
>
<snip>

> Another question: should we only do this for keys longer than a certain
> length, or should we do it for all keys? I'm not sure whether the
> hashing will take much CPU time; if it's non-negligible, we could skip
> this step for keys less than, say, 30 characters long.
>

Hello,

I'm happy to write a patch, but I think that doing it conditionally
based on length is silly, as well as using a sha1 hash. sha1 wasn't
written with speed in mind, and md5 is definitely quicker (I didn't do a
benchmark, but that's what I read). No cryptographic validation is
happening when doing cache saves and hits, so the extra security of sha1
over md5 isn't necessary. A hard-coded salt will help with an attacker
trying to create a collision, but doesn't have the extra overhead of
generating sha1 hashes.

Essentially I'm saying I like the way that this same problem was fixed
in the django ticket tracker-- create an md5 sum of every string, but
adding a salt is okay too.


Thanks!

Jeff Anderson

signature.asc

Andrew McNabb

unread,
Apr 4, 2010, 11:30:40 PM4/4/10
to smug...@googlegroups.com
On Sat, Apr 03, 2010 at 10:36:28AM -0600, Jeff Anderson wrote:
>
> I'm happy to write a patch, but I think that doing it conditionally
> based on length is silly, as well as using a sha1 hash. sha1 wasn't
> written with speed in mind, and md5 is definitely quicker (I didn't do a
> benchmark, but that's what I read). No cryptographic validation is
> happening when doing cache saves and hits, so the extra security of sha1
> over md5 isn't necessary. A hard-coded salt will help with an attacker
> trying to create a collision, but doesn't have the extra overhead of
> generating sha1 hashes.
>
> Essentially I'm saying I like the way that this same problem was fixed
> in the django ticket tracker-- create an md5 sum of every string, but
> adding a salt is okay too.

An md5 hash is probably fine. Theoretically, you might be able to
construct some situation that makes the wrong page appear or something,
but I'm sure I could come up with a dozen more plausible threats. :)

Thanks for your help.

Reply all
Reply to author
Forward
0 new messages