Deprecating cmemcache, adding pylibmc


Jacob Burch

Feb 21, 2010, 8:32:19 PM
to Django developers
This is in regard to tickets #11675 and #12427.


Review of the problems
-----
There are two overarching problems with Django's current
implementation of memcached as a backend:

A) The import tree method of picking the appropriate library (if
cmemcache is installed, use it; if not, use python-memcached; if
neither is available, throw an error).
B) Open support for the retired and buggy cmemcache (see
http://gijsbert.org/cmemcache/)
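
For context, the implicit selection described in (A) amounts to a chain of import attempts. A minimal sketch of that pattern follows. The helper name `first_importable` is hypothetical and this is not Django's actual code, which does the equivalent with nested try/except blocks at module level.

```python
import importlib


def first_importable(candidates):
    """Return (name, module) for the first candidate that imports cleanly."""
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of %s could be imported" % ", ".join(candidates))


# The order encodes the implicit preference: cmemcache first, then
# python-memcached (whose importable module is named "memcache").
MEMCACHE_CANDIDATES = ("cmemcache", "memcache")
```

The user never states which library they want; whichever happens to be installed first in the chain wins, which is the behavior under discussion.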


Both tickets are marked as DDN (design decision needed), and while the
two are somewhat orthogonal, there seems to be a single solution that
covers both: a more explicit and inclusive method for picking a
memcached library. With this in place, deprecating cmemcache because it
is no longer supported is as easy as picking the milestones for
deprecation and removal.

Including cmemcache, there are currently four major libraries for
accessing memcached: python-memcached, cmemcache, python-libmemcache,
and pylibmc. A few details on these, along with speed benchmarks, can be
found here: http://amix.dk/blog/viewEntry/19471

Proposed Solution
------
A slight rework of the memcache client that implements a base class
which resembles the current Django 1.1 memcache client. Anything that
implements the current CACHE_BACKEND setting "memcached://server:port"
would use this base class. Throw a deprecation warning if the
cmemcache library is being used in this way.

Additionally, add explicit subclasses for pylibmc and
python-libmemcache, so that "pylibmc://server:port",
"memcached://server:port?lib=pylibmc", or some variation thereof
controls which subclass gets used.
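
To make the two URI spellings concrete, here is a rough sketch of how a backend string could be reduced to a library name, a server, and leftover parameters. The function name `parse_cache_uri`, the `KNOWN_LIBS` tuple, and the `lib` parameter are illustrative assumptions based on the spellings floated above; none of this is an existing Django API.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical registry of accepted names; not something Django defines.
KNOWN_LIBS = ("python-memcached", "cmemcache", "pylibmc", "python-libmemcache")


def parse_cache_uri(uri, default_lib=None):
    """Split a cache URI into (library, server, params)."""
    parts = urlsplit(uri)
    params = dict(parse_qsl(parts.query))
    if parts.scheme == "memcached":
        # Spelling 1: memcached://host:port?lib=pylibmc
        lib = params.pop("lib", default_lib)
    else:
        # Spelling 2: pylibmc://host:port
        lib = parts.scheme
    if lib is not None and lib not in KNOWN_LIBS:
        raise ValueError("unknown memcached library: %r" % lib)
    return lib, parts.netloc, params
```

A plain "memcached://server:port" yields no library at all here (`None`), which leaves room for either the current guessing behavior or a later requirement to be explicit.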

Forcing the user to be explicit on what library they use could be
added later, although I don't know if it's necessary.

pylibmc and python-memcached have almost identical APIs
(from pylibmc's docs: "pylibmc's interface is intentionally made as
close to python-memcached as possible, so that applications can
drop-in replace it."), and differ only in pylibmc's configurable
memcached behaviors and in their hashing methods. (See
http://sendapatch.se/projects/pylibmc/#behaviors
and http://sendapatch.se/projects/pylibmc/#differences-from-python-memcached.)
The subclass should end up being fairly lightweight.

I also am only partially familiar with python-libmemcache, so if
anyone can speak to key differences in implementation, it would be
very handy.

Hopefully this is enough of a discussion to get the ball rolling.

Jacob Kaplan-Moss

Feb 21, 2010, 11:22:03 PM
to django-d...@googlegroups.com
On Sun, Feb 21, 2010 at 8:32 PM, Jacob Burch <jacob...@gmail.com> wrote:
> This is in regards to Tickets #11675 and #12427.

A bit more background: I've been told at PyCon that cmemcache is
unmaintained and deliberately being left to die in favor of pylibmc.

Because of that I'm +1 on your proposal, and I'll argue that we not
consider it a "feature addition" so it can go in for 1.2. If you can
work up a patch I'll be happy to review.

I have a couple of questions:

> A slight rework of the memcache client that implements a base class
> which resembles the current Django 1.1 memcache client. Anything that
> implements the current CACHE_BACKEND setting "memcached://server:port"
> would use this base class. Throw a deprecation warning if the
> cmemcache library is being used in this way.

I'm not quite clear on what you're talking about here. You want to
raise a deprecation warning for anyone who uses "memcached://"
instead of something new-style?

> Additionally, add explicit subclasses for pylibmc and
> python-libmemcache, so if "pylibmc://server:port",
> "memcached://server:port?lib=pylibmc" or some variety there of
> controls what subclass gets used.

I discussed "memcached+pylibmc://", "memcached+cmemcache" etc. It's a
bikeshed, though, and if you write the patch you can paint it.

> Forcing the user to be explicit on what library they use could be
> added later, although I don't know if it's necessary.

"In the face of ambiguity, refuse the temptation to guess."

We've violated this rule by guessing once; I'd like to switch to
*just* using python-memcached, and force anyone who wants anything else
(including cmemcache) to be explicit. We can't do that in 1.2,
though, so maybe this is the warning you were talking about? i.e. warn
if cmemcache is chosen automatically?

> Pylibmc and python-memcache have almost identical API characteristics
> (from pylibmc's docs: "pylibmc's interface is intentionally made as
> close to python-memcached as possible, so that applications can
> drop-in replace it."), and only differ in pylibmc's flexibility of memcache
> behaviors and their hashing method. (See http://sendapatch.se/projects/pylibmc/#behaviors
> and http://sendapatch.se/projects/pylibmc/#differences-from-python-memcached).
> The subclass should end up being fairly lightweight.

Please make sure if you do this that there's a mechanism to use
pylibmc's consistent hashing -- it's the main reason to switch to
pylibmc. Actually, the more I think about it, if you're using pylibmc
then consistent hashing should be the *default*.
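
To illustrate why consistent hashing matters when servers are added, here is a toy comparison of modulo placement against a hash ring. This is a generic sketch, not pylibmc's or libmemcached's ketama implementation; `HashRing`, `modulo_server`, the server names, and the choice of MD5 with 100 points per server are all assumptions made for the demonstration.

```python
import bisect
import hashlib


def _point(value):
    """Map a string to a large integer position."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def modulo_server(key, servers):
    """Naive placement: hash the key, take it modulo the server count."""
    return servers[_point(key) % len(servers)]


class HashRing:
    """Toy consistent-hash ring with several points per server."""

    def __init__(self, servers, points_per_server=100):
        self._ring = sorted(
            (_point("%s#%d" % (server, i)), server)
            for server in servers
            for i in range(points_per_server)
        )
        self._positions = [position for position, _ in self._ring]

    def server_for(self, key):
        # First ring point clockwise from the key, wrapping at the end.
        index = bisect.bisect(self._positions, _point(key)) % len(self._ring)
        return self._ring[index][1]


def moved_fraction(before, after, keys):
    """Fraction of keys whose server differs between two placements."""
    moved = sum(1 for key in keys if before(key) != after(key))
    return moved / len(keys)


keys = ["key:%d" % i for i in range(2000)]
old = ["cache%d:11211" % i for i in range(4)]
new = old + ["cache4:11211"]

modulo_moved = moved_fraction(
    lambda k: modulo_server(k, old), lambda k: modulo_server(k, new), keys)
old_ring, new_ring = HashRing(old), HashRing(new)
ring_moved = moved_fraction(old_ring.server_for, new_ring.server_for, keys)
```

Going from four to five servers, modulo placement is expected to remap most keys, while the ring only hands keys over to the new server, so far fewer cached entries become unreachable.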

Thanks for getting this started!

Jacob

Rajiv Makhijani

Feb 21, 2010, 11:46:08 PM
to django-d...@googlegroups.com
I recently implemented a custom caching backend to add support for
pylibmc on a large site due to issues with cmemcache.

For the most part the 'pylibmc' APIs are the same as 'python-memcached'
and 'cmemcache'.

One changed behavior I ran into with 'pylibmc' that I did not experience
with other caching libraries is that when calling get_multi() with an
empty list of keys, an exception was thrown. I do not believe this was
the case with 'cmemcache' so legacy code may not be designed to handle
this. To make my custom backend compatible, I had to change it to be:

    def get_many(self, keys):
        # pylibmc raised an exception for an empty key list; short-circuit.
        if not keys:
            return {}
        return self._cache.get_multi([smart_str(key) for key in keys])

As far as hashing goes, pylibmc has a different default hashing
mechanism than cmemcache or python-memcached. However, this was already
an issue for cmemcache and python-memcached for a while as they
previously had differing hashing mechanisms. To make the new
pylibmc-based caching backend compatible by default with
python-memcached and cmemcache hashing, an attribute can be set as such
on pylibmc:

mc.behaviors["hash"] = "crc"

However, I believe if we are adding a parameter for the library, there
should also be an optional parameter to specify the hashing mechanism,
i.e. memcached://server:port?lib=pylibmc&hash=crc.
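
As a rough sketch of that idea, an optional `hash` parameter could be turned into the behaviors dict for a pylibmc client, defaulting to "crc" for compatibility. The function name `pylibmc_behaviors` and the `hash` parameter name are hypothetical, and the set of hash names pylibmc accepts should be checked against its behaviors documentation.

```python
def pylibmc_behaviors(params, default_hash="crc"):
    """Build the behaviors dict for a pylibmc client from URI parameters.

    The "crc" default follows the compatibility setting given in this
    thread; any other value is passed through unchecked (an assumption).
    """
    return {"hash": params.get("hash", default_hash)}
```

With a real client this would presumably be applied as `mc.behaviors = pylibmc_behaviors(params)`, so that an unconfigured pylibmc backend keeps hashing the way the existing libraries do.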

Thanks,
Rajiv Makhijani


Jeff Balogh

Feb 22, 2010, 11:49:37 AM
to django-d...@googlegroups.com
On Sun, Feb 21, 2010 at 8:46 PM, Rajiv Makhijani <ra...@blue-tech.org> wrote:
> I recently implemented a custom caching backend to add support for pylibmc
> on a large site due to issues with cmemcache.
>
> For the most part the 'pylibmc' APIs are the same as 'python-memcached' and
> 'cmemcache'.
>
> One changed behavior I ran into with 'pylibmc' that I did not experience
> with other caching libraries is that when calling get_multi() with an empty
> list of keys, an exception was thrown. I do not believe this was the case
> with 'cmemcache' so legacy code may not be designed to handle this. To make
> my custom backend compatible, I had to change it to be:
>
> def get_many(self, keys):
>        if keys == []:
>                return {}
>        return self._cache.get_multi(map(smart_str,keys))

I filed a bug on this last night, and it's fixed already:
http://github.com/lericson/pylibmc/issues/issue/7

Mat Clayton

Feb 22, 2010, 12:36:31 PM
to django-d...@googlegroups.com
A huge +1 on this, having been bitten by cmemcache multiple times. I know this will be controversial, and it's probably totally unrelated, but I thought I'd bring up some other issues we have with caching:

1. As we keep adding more memcached instances it gets increasingly painful. Some form of consistent hashing would be a big win, but it introduces a backwards-compatibility issue with other memcached clients. In my opinion it would still be worth it. I know others have said this has been a problem in the past, so I just thought I would flag it.

2. The current Django cache stops you from setting a timeout of 0, which we use under some circumstances, primarily for user profile info. It's really frustrating having to hack around this, and I'd consider it a bug.

3. Allowing access to the minimum compression length. As with most web apps, RAM rather than CPU is our bottleneck, so we are happy to spend some CPU cycles to fit more into memcached.

Mike's http://github.com/mmalone/django-caching has some good examples of fixing 2/3
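
For readers unfamiliar with issue 2, the sketch below shows one way a timeout of 0 can get lost, assuming the backend picks its timeout with an expression like `timeout or default_timeout`. The function names and the 300-second default are illustrative only, not Django's code.

```python
DEFAULT_TIMEOUT = 300  # seconds; illustrative value


def effective_timeout_lossy(timeout=0, default=DEFAULT_TIMEOUT):
    """`or` treats 0 as "not given", so an explicit 0 becomes the default."""
    return timeout or default


def effective_timeout(timeout=None, default=DEFAULT_TIMEOUT):
    """Use None as the "not given" marker so 0 passes through unchanged."""
    return default if timeout is None else timeout
```

Since memcached interprets an expiry of 0 as "never expire", the second form lets a caller ask for that explicitly. For issue 3, python-memcached's `set()` already accepts a `min_compress_len` argument, so exposing it is mostly a matter of passing a setting through.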

just my quick thoughts,
Mat


--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-d...@googlegroups.com.
To unsubscribe from this group, send email to django-develop...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.




--
--
Matthew Clayton | Founder/CEO
Wakari Limited

twitter http://www.twitter.com/matclayton

email m...@wakari.co.uk
mobile +44 7872007851

skype matclayton

James Bennett

Feb 22, 2010, 5:09:39 PM
to django-d...@googlegroups.com
On Sun, Feb 21, 2010 at 11:22 PM, Jacob Kaplan-Moss <ja...@jacobian.org> wrote:
> A bit more background: I've been told at PyCon that cmemcache is
> unmaintained and deliberately being left to die in favor of pylibmc.
>
> Because of that I'm +1 on your proposal, and I'll argue that we not
> consider it a "feature addition" so it can go in for 1.2. If you can
> work up a patch I'll be happy to review.

At this point in the release process, I'm not sure we can do
everything that's being talked about in this thread. Given that we're
feature-frozen and that there's no way we can spring a completely new
cache backend on people at the last minute, here's what's possible
within our release process right now for 1.2:

1. Specifying memcached as the cache backend continues using the same
"memcached://" scheme as it always has. There is no way we can change
that in the 1.2 timeframe.

2. The memcached backend in Django should look first for the correct
library, and fall back to the old one as needed.

3. When falling back to the old memcached library, 1.2 should raise
PendingDeprecationWarning; 1.3 should promote that to a
DeprecationWarning and 1.4 should remove the support entirely.

Anything and everything else being discussed is out of scope for 1.2
and must wait for the 1.3 feature proposal window.
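
Points 2 and 3 could look roughly like the sketch below. It assumes python-memcached (module `memcache`) is the library to look for first and cmemcache is the fallback; the function name `load_memcache_module` and the `importer` hook (there only so the behavior can be exercised without either library installed) are hypothetical.

```python
import importlib
import warnings


def load_memcache_module(preferred="memcache", deprecated="cmemcache",
                         importer=importlib.import_module):
    """Import the preferred binding, falling back to the deprecated one."""
    try:
        return importer(preferred)
    except ImportError:
        # If the fallback is missing too, its ImportError propagates.
        module = importer(deprecated)
        warnings.warn(
            "Support for %s is deprecated; install %s instead."
            % (deprecated, preferred),
            PendingDeprecationWarning,
            stacklevel=2,
        )
        return module
```

Note that Python ignores PendingDeprecationWarning by default, so users would only see it when running with warnings enabled (for example `python -W default`); promoting it to DeprecationWarning in the following release is a one-word change.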

--
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."

Rajiv Makhijani

Feb 22, 2010, 5:45:05 PM
to django-d...@googlegroups.com
On 2/22/10 2:09 PM, James Bennett wrote:
> 1. Specifying memcached as the cache backend continues using the same
> "memcached://" scheme as it always has. There is no way we can change
> that in the 1.2 timeframe.
>
I concur; this needs more thought before we rush a change like that.

> 2. The memcached backend in Django should look first for the correct
> library, and fall back to the old one as needed.
>
By this do you mean the order would be as such:
1) pylibmc
2) cmemcache
3) python-memcached

If pylibmc, the "correct library," is treated as the default, we should
also ensure that when pylibmc is used, its default hashing method is
changed to match that of the two current libraries. (I included sample
code for this in my prior email.) This will prevent the change from
breaking any existing configurations.

On 2/22/10 8:49 AM, Jeff Balogh wrote:
> I filed a bug on this last night, and it's fixed already:
> http://github.com/lericson/pylibmc/issues/issue/7
>

That's great to hear, thanks. So no need to work around that, but maybe
the Django documentation should warn of this issue for older versions if
pylibmc support is added as a default.


Mike Malone

Feb 22, 2010, 9:41:49 PM
to django-d...@googlegroups.com
> At this point in the release process, I'm not sure we can do
> everything that's being talked about in this thread. Given that we're
> feature-frozen and that there's no way we can spring a completely new
> cache backend on people at the last minute, here's what's possible
> within our release process right now for 1.2:
>
> 1. Specifying memcached as the cache backend continues using the same
> "memcached://" scheme as it always has. There is no way we can change
> that in the 1.2 timeframe.
>
> 2. The memcached backend in Django should look first for the correct
> library, and fall back to the old one as needed.
>
> 3. When falling back to the old memcached library, 1.2 should raise
> PendingDeprecationWarning; 1.3 should promote that to a
> DeprecationWarning and 1.4 should remove the support entirely.
>
> Anything and everything else being discussed is out of scope for 1.2
> and must wait for the 1.3 feature proposal window.

Yeah, at this point doing anything significant in 1.2 seems like a bad
idea. But, some of this stuff could potentially be done using query
string args to the cache backend URI, which would maintain backwards
compatibility. Without thinking too much about naming the arguments,
it'd look something like this:

CACHE_BACKEND = (
    "memcached://127.0.0.1:11211/"
    "?binding=pylibmc&enable_compression=True"
    "&min_compress_length=150000&fix_zero_timeout=True"
)

Still, I'm hesitant to rush into this sort of decision because I'd
hate to make the wrong one and then have to support it going forward.

Also, since the URI isn't known at the module level (it's passed into
CacheClass.__init__) we'd probably have to import the memcache client
in __init__ or something. Not the end of the world, but kind of messy.
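
A minimal sketch of that deferred import, under the assumption that the `binding` argument names an importable module directly (so python-memcached would be spelled "memcache"). `InvalidCacheBackendError` here is a local stand-in for Django's exception of the same name, and client construction is left out.

```python
import importlib


class InvalidCacheBackendError(Exception):
    """Stand-in for Django's exception of the same name."""


class CacheClass:
    """Sketch: choose the client library per instance, not at import time."""

    def __init__(self, server, params):
        # "binding" is the hypothetical query-string argument from above.
        binding = params.get("binding", "memcache")
        try:
            self._lib = importlib.import_module(binding)
        except ImportError as exc:
            raise InvalidCacheBackendError(
                "memcached binding %r is not installed" % binding) from exc
        self.server = server
```

The cost is that a missing library is only reported when the cache is first instantiated rather than when the backend module is imported, which is part of what makes this approach feel messy.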

> Mike's http://github.com/mmalone/django-caching has some good examples of fixing 2/3

FWIW, the "more correct" way to do this is to create your own custom
cache backend like this: http://gist.github.com/299905 then use the
full module path in the CACHE_BACKEND setting, like:

CACHE_BACKEND = 'foo.bar.custom_memcache://127.0.0.1:11211/'

Mike

Mat Clayton

Feb 23, 2010, 9:29:16 AM
to django-d...@googlegroups.com
That makes much more sense, but is surely out of scope for 1.2. Essentially I need slightly finer-grained control over the cache, and I suspect some others may too. Django won't manage to cover everyone's edge cases, but it would be good to still be able to use the good parts of Django's caching code and extend it as required. I wasn't expecting any of these to be built; I just thought I'd flag them as "hacks" we currently use on the normal caching system to achieve what we need, as it's in the same part of the code base.

Mat
 


Jacob Burch

Mar 1, 2010, 12:40:05 AM
to Django developers
Thanks all for the helpful discussion here. From the sound of things,
my course of action will be:

1) Write a patch that raises a PendingDeprecationWarning when cmemcache
is used, and update the docs to note the coming deprecation of
cmemcache.
2) Begin working on a more involved patch that will hopefully get
slated for 1.3.

I'll attach 1) to the cmemcache ticket (ticket #12484 may be able to
be merged into it), and put my work for 2) into the pylibmc ticket
once I finish it.


Jacob Kaplan-Moss

Mar 1, 2010, 10:27:25 AM
to django-d...@googlegroups.com
On Sun, Feb 28, 2010 at 11:40 PM, Jacob Burch <jacob...@gmail.com> wrote:
> Thanks all for the helpful discussion here. From the sounds of thing,
> my course of action will be:
>
> 1) Get a patch that throws a PendingDeprecationWarning when cmemcache
> is used + Change of the docs to note the coming deprecation of
> cmemcache
> 2) Begin working on a more involved patch that hopefully get slated in
> for 1.3.

Yeah, that sounds right, and thanks.

Also, a step 2.5, if you'd like, would be to publish a tiny app on PyPI
that enables the use of pylibmc via an external cache backend. We
could point to it in the deprecation notes when we explain why
cmemcache is being deprecated.

Jacob

mrts

Mar 16, 2010, 8:15:17 PM
to Django developers
On Mar 1, 5:27 pm, Jacob Kaplan-Moss <ja...@jacobian.org> wrote:
> Also, a step 2.5, if you'd like, would be to write a tiny app on pypi
> that enabled the use of pylibmc via an external cache backend. We
> could point to it in the deprecation notes when we explain why
> cmemcache is being deprecated.

See http://gist.github.com/334682

Best,
MS
