This behavior is extremely dangerous -- in case you weren't aware, cmemcache
and python-memcached disagree on how to choose a memcached server from a list.
Now set up four or five machines with slightly varying needs: disaster.
The only two possible solutions:
1. ONLY use either cmemcache or python-memcached.
2. Use either cmemcache, or python-memcached with my cmemcache_hash
module, which makes python-memcached hash the way cmemcache does when
choosing a server.
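To illustrate why mixing the two goes wrong, here is a minimal sketch. It assumes one client hashes keys with plain CRC32 and the other keeps only 15 bits from the upper half of the CRC; both functions and the server addresses are stand-ins made up for this example, not code taken from either library.

```python
import binascii

# Made-up server list; every machine is assumed to share the same list and order.
SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211", "10.0.0.4:11211"]


def raw_crc32_hash(key):
    # Assumption: stands in for a client that hashes with plain CRC32.
    return binascii.crc32(key.encode("utf-8")) & 0xFFFFFFFF


def cmemcache_style_hash(key):
    # Assumption: stands in for a cmemcache-style hash that keeps
    # 15 bits from the upper half of the CRC and never returns 0.
    crc = binascii.crc32(key.encode("utf-8")) & 0xFFFFFFFF
    return ((crc >> 16) & 0x7FFF) or 1


def pick_server(hash_func, key, servers=SERVERS):
    # Both clients are assumed to pick a server by hash modulo list length.
    return servers[hash_func(key) % len(servers)]


keys = ["session:%d" % i for i in range(100)]
mismatches = [
    k for k in keys
    if pick_server(raw_crc32_hash, k) != pick_server(cmemcache_style_hash, k)
]
print("%d of %d keys map to different servers" % (len(mismatches), len(keys)))
```

Any key in `mismatches` is one that a machine using the first hash would write to one server while a machine using the second would look for it on another, which is how a session can seem to vanish.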
And before I get the "but it works for 90% of us" - I'm sorry, but
you'll have to choose. This took _ages_ to track down, and caused
personal loss as well as execution of a number of kittens.
Ludvig Ericson
ludvig....@gmail.com
Could you create a documentation patch, or at least open a ticket about
this, please, so that somebody else creates a documentation patch? This
is definitely worth recording in the docs. It's ultimately a
configuration issue with the people setting up the four or five machines
(although see below for how we can make it controllable without harming
the status quo).
Have you filed your patch for python-memcached with the upstream
maintainer yet? He's usually fairly responsive to bug reports. I haven't
had any personal dealings with the cmemcache guys, but they might be
interested in some sort of common agreement on hashing algorithms as
well.
Django should continue to try to do both imports by default, since even
if we introduced a setting, there's still no guarantee that the setting
would be the same on all the different installations and there's no way
to ensure that it is automatically (it's a configuration issue). I'm
very much in favour of "should work out of the box regardless of which
memcache wrapper you have installed" for the time being, at least.
However, a setting to force it to one or the other (and if no setting is
present we do the current fallback) would be a good solution for the
multi-machine installation case. Then somebody scaling to multiple
machines could use the same settings file with confidence that things
would fail with prejudice if they configured things correctly but left
off installing cmemcache.
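A rough sketch of that behaviour, under assumptions: the function name, its `forced` argument (standing in for whatever the setting ends up being called), and the candidate order are all hypothetical, and this is not how Django currently does it.

```python
import importlib


def load_memcache_module(forced=None, candidates=("cmemcache", "memcache")):
    """Return a memcached client module.

    `forced` is a hypothetical stand-in for the proposed setting: when it
    is given, only that module is tried, and a missing install raises
    ImportError instead of silently falling back to another library.
    """
    if forced is not None:
        # Fail loudly: no fallback when the choice was made explicitly.
        return importlib.import_module(forced)
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of %s could be imported" % ", ".join(candidates))


# Usage (not executed here, since neither library may be installed):
#   client_module = load_memcache_module()                    # current fallback
#   client_module = load_memcache_module(forced="cmemcache")  # same on every machine, or an error
```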
Best of all (and in parallel), though, would be the idea where the
different memcached modules used the same hashing algorithm.
[...]
> This took _ages_ to track down, and caused
> personal loss as well as execution of a number of kittens.
Your violence towards animals is noted with disappointment. However,
when I rule the world there may be a place for you in my organisation.
Regards,
Malcolm
> Could you create a documentation patch, or at least open a ticket about
> this, please, so that somebody else creates a documentation patch? This
> is definitely worth recording in the docs. It's ultimately a
> configuration issue with the people setting up the four or five machines
> (although see below for how we can make it controllable without harming
> the status quo).
Opened ticket #9324 now.
http://code.djangoproject.com/ticket/9324
And yes, it's a configuration issue, but to err is human. :-)
> Have you filed your patch for python-memcached with the upstream
> maintainer yet? He's usually fairly responsive to bug reports. I haven't
> had any personal dealings with the cmemcache guys, but they might be
> interested in some sort of common agreement on hashing algorithms as
> well.
I haven't, but I haven't tried either -- if I had my say, I'd go for
cmemcache's, because pgmemcache employs the same algorithm, and so
does the C library libmemcached (which python-libmemcached uses, as
well as my own pylibmc), to my knowledge.
I'll fire off an e-mail later, should be a quick change for him, as he
could just rip out the hash function from my cmemcache_hash.
> Django should continue to try to do both imports by default, since even
[...]
> However, a setting to force it to one or the other (and if no setting is
[...]
> Best of all (and in parallel), though, would be the idea where the
> different memcached modules used the same hashing algorithm.
I'm convinced and yes, this would remedy the issue to a great extent -
though it really is just moving the issue from having libraries
installed or not to having settings set. But a setting with some
googlable (new adj) documentation would be "friggin' ace, mate".
I suggest the following to be added to the docs (or something like it):
If you're experiencing problems where a user is logged out
sporadically, you should double-check that all your Django setups are
using the same memcached library.
You can tell Django to only use one library through the
`KITTENS_AND_BEER` setting.
> Your violence towards animals is noted with disappointment. However,
> when I rule the world there may be a place for you in my organisation.
Be sure to let me know.
Med vänliga hälsningar,
Ludvig Ericson
As I said in the ticket[1], using python-memcached with
cmemcache_hash[2] would probably be the most beneficial way to go.
So,
1. Try cmemcache
2. Try python-memcached
2.1. Try cmemcache_hash
Also, as said earlier, we should definitely have a setting for
enforcing one or the other.
[1]: http://code.djangoproject.com/ticket/9324
[2]: http://pypi.python.org/pypi/cmemcache_hash
(Apologies if the formatting got weird, but Mail.app is being real lame.)
- Ludvig
That's possible. The hash function in the pure-Python memcached wrapper
is replaceable (it's an attribute), so I was looking at replacing it
with the version from cmemcache. Using a third hashing algorithm would
be a bit silly, since then there's no compatibility with anything else
and it's just extra effort for no huge gain.
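Roughly, the swap could look like the sketch below. It assumes the wrapper consults a module-level attribute named `serverHashFunction` each time it picks a server, and that the cmemcache-style hash keeps 15 bits from the upper half of the CRC32; both are assumptions to check against the installed version, and the helper name is made up. A stand-in object is used so the sketch runs without python-memcached installed.

```python
import binascii
import types


def cmemcache_style_hash(key):
    # Assumption: approximates the cmemcache-style server hash.
    if isinstance(key, str):
        key = key.encode("utf-8")
    return (((binascii.crc32(key) & 0xFFFFFFFF) >> 16) & 0x7FFF) or 1


def use_cmemcache_style_hash(memcache_module):
    # Hypothetical helper. Assumes the module reads a module-level
    # `serverHashFunction` attribute whenever it selects a server.
    memcache_module.serverHashFunction = cmemcache_style_hash
    return memcache_module


# Stand-in for the real module, so this runs without any client installed.
fake_memcache = types.SimpleNamespace(serverHashFunction=binascii.crc32)
use_cmemcache_style_hash(fake_memcache)
print(fake_memcache.serverHashFunction is cmemcache_style_hash)
```

With the real module one would pass the imported `memcache` module instead of the stand-in, before creating any clients.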
Regards,
Malcolm
And that's _exactly_ what my cmemcache_hash package does! :-)
http://pypi.python.org/pypi/cmemcache_hash
I don't know if I attached a license, but it's BSD - so feel free to rip
it.
- Ludvig
Okay. If we go this path, it's something to include in Django, rather
than recommending yet another caching package. We either make it a
configuration option to force python-memcached or cmemcache or we just
"Do The Right Thing", with the latter being preferable.
I hadn't realised from your earlier description that you had a
non-intrusive change that we could drop into Django. My
misunderstanding. Glad we're on the same page now.
Regards,
Malcolm
Absolutely -- that'd be my most favored solution as at that point it
wouldn't matter if you ran cmemcache or python-memcached (other than
that cmemcache might kill the process it runs in on protocol error.)
So, do you want me to make a patch or could you do it? I feel I'm not
entirely sure exactly where it should reside and so forth. But I could
take the time and find out if you're too busy.
> I hadn't realised from your earlier description that you had a
> non-intrusive change that we could drop into Django. My
> misunderstanding. Glad we're on the same page now.
I'm not a native speaker, so I might not be expressing myself in the
best English you've ever seen.
- Ludvig
What concerns me is that this will break the usage of memcached without
Django's cache API. A couple of times I have needed to instantiate
memcache.Client directly and work with it. If it doesn't see the
cache the same way Django does, we'd have the very same hard-to-debug
issue that started this thread.
This is indeed a concern. I was intending to put in a module that you
can import to get the same behaviour as Django. So instead of
import memcache
you write
from django.core.cache import memcached
I'm not 100% certain, though, that this is the way to go. I'm letting it
bounce around for a few days. Both options have their drawbacks and it's
kind of a matter of weighing up which inconvenience is more likely to
occur, given that they're both relatively uncommon (after all, if you're
accessing Django objects via direct usage, you need to be using Django's
get_cache_key() and the like anyway).
Regards,
Malcolm
True, but that's because python-memcached for some reason still uses its
own hashing algorithm (pure CRC32) while other libraries are more or less
unified in their hashing algorithm. (Wouldn't know about libmemcached.)
*ugh* Why can you never eat the pie and have it too? :(
- Ludvig