Resetting cache gracefully at the end of every test?


Jim D.

Jul 1, 2011, 11:54:17 AM
to django-d...@googlegroups.com
This issue came up again for me recently: Because the cache is not reset or flushed after every test, cache values linger there. This can create headaches when writing tests.

I would like to propose that we treat the cache backend(s) like database backends during the test run. Specifically, I propose that we:

* Prefix "test_" before the KEY_PREFIX in each backend during global setup. This is equivalent to adding the test_ prefix to DB names. In doing this we prevent cache calls during tests from overwriting possibly legitimate values.
* Monkey patch cache.set() (and cache.set_many()) such that we record all keys used to set values during a test.
* Use cache.delete_many() on the list of keys we record being set during test teardown. This will restore the cache to its previous state after each test run. It's better than flush() which clears the whole cache and is too destructive to be run during testing.
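The bookkeeping described in the steps above could be sketched roughly like this. Note this uses a minimal dict-backed stand-in for the cache object rather than Django's real backend, since the wrapping pattern is the point here; the names `SimpleCache` and `track_keys` are mine, not from any patch:

```python
class SimpleCache:
    """Minimal dict-backed stand-in for a cache backend."""
    def __init__(self):
        self._data = {}
    def set(self, key, value):
        self._data[key] = value
    def set_many(self, mapping):
        self._data.update(mapping)
    def get(self, key, default=None):
        return self._data.get(key, default)
    def delete_many(self, keys):
        for key in keys:
            self._data.pop(key, None)

def track_keys(cache):
    """Monkey-patch set()/set_many() so every key written during the
    test is recorded; return the set used for the recording."""
    seen = set()
    orig_set, orig_set_many = cache.set, cache.set_many
    def tracked_set(key, value):
        seen.add(key)
        orig_set(key, value)
    def tracked_set_many(mapping):
        seen.update(mapping)
        orig_set_many(mapping)
    cache.set = tracked_set
    cache.set_many = tracked_set_many
    return seen

cache = SimpleCache()
cache.set("legit", 1)             # pre-existing "production" value
seen = track_keys(cache)          # global test setup
cache.set("a", 1)                 # writes made during a test...
cache.set_many({"b": 2, "c": 3})
cache.delete_many(seen)           # ...are rolled back at teardown
```

After the `delete_many()` call, only the pre-existing "legit" entry survives, which is the "restore to previous state" property the proposal is after.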

I don't think it needs to be more complicated than this.

I checked around to see if anyone had opened a ticket related to this or had previous discussions on the subject but couldn't find anything. I did briefly touch upon this with Russ in a comment thread on https://code.djangoproject.com/ticket/15026 and I noticed he made a comment in this vein on Django Users: https://groups.google.com/forum/#!topic/django-users/OOGO-3MIO_c . If there is something already open on this or a decision has already been made, please do feel free to point me in the right direction.

Otherwise, if there is interest I'd be happy to open a ticket and get a patch started.

Thanks

Jim

Jeremy Dunck

Jul 1, 2011, 1:01:40 PM
to django-d...@googlegroups.com
On Fri, Jul 1, 2011 at 10:54 AM, Jim D. <jim.d...@gmail.com> wrote:
> This issue came up again for me recently: Because the cache is not reset or
> flushed after every test, cache values linger there. This can create
> headaches when writing tests.
> I would like to propose that we treat the cache backend(s) like database
> backends during the test run. Specifically, I propose that we:
> * Prefix "test_" before the KEY_PREFIX in each backend during global setup.
> This is equivalent to adding the test_ prefix to DB names. In doing this we
> prevent cache calls during tests from overwriting possibly legitimate values.
> * Monkey patch cache.set() (and cache.set_many()) such that we record all
> keys used to set values during a test.
> * Use cache.delete_many() on the list of keys we record being set during
> test teardown. This will restore the cache to its previous state after each
> test run. It's better than flush() which clears the whole cache and is too
> destructive to be run during testing.
> I don't think it needs to be more complicated than this.

Well, I think you forgot all the other low-level cache options, like
.incr, .add, etc.?

If I were to do this, I wouldn't have cache shared between tests or
environments, so I'd be tempted to use flush.

Rather than "test_" and bookkeeping on key names, what if the key
prefix were the hostname + a timestamp from the beginning of the run,
so that test keys would never collide?
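A run-unique prefix along these lines could be built at the start of the test run; the helper name below is hypothetical, but the ingredients are just the standard library:

```python
import socket
import time

def make_test_key_prefix():
    """Build a key prefix unique to this host and this test run, so
    test keys never collide with real keys or a previous run's keys."""
    return "test-%s-%d" % (socket.gethostname(), int(time.time()))

prefix = make_test_key_prefix()
# e.g. "test-buildbox-1309538057"
```

The trade-off, as discussed below, is that nothing ever deletes those run-scoped keys.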

Or, what if we just had hooks called from setUp and tearDown, either
NotImplemented methods, or signals?

Book-keeping with monkey-patched methods sounds fiddly to me; consider
the case where celery pokes keys in, but our bookkeeping doesn't know
about it?

Jim Dalton

Jul 1, 2011, 1:27:59 PM
to django-d...@googlegroups.com
Awesome feedback, thanks.

On Jul 1, 2011, at 10:01 AM, Jeremy Dunck wrote:

> Well, I think you forgot all the other low-level cache options, like
> .incr, .add, etc.?

Yep, forgot those so they would conceivably need to be tracked.

>
> If I were to do this, I wouldn't have cache shared between tests or
> environments, so I'd be tempted to use flush.

I'm not totally sure if I'm following you on this point?

>
> Rather than "test_" and bookkeeping on key names, what if the key
> prefix were the hostname + a timestamp from the beginning of the run,
> so that test keys would never collide?

What I like about this idea is it avoids the bookkeeping and monkey-patching nonsense. What I don't like about it is that it feels like you could potentially be adding a lot of junk to your cache. A test run might end up hitting it with many values, each unique. A requirement I'm sort of working with in my mind here is that the cache returns to the "pristine" state it was in prior to the test run.

>
> Or, what if we just had hooks called from setUp and tearDown, either
> NotImplemented methods, or signals?
>
> Book-keeping with monkey-patched methods sounds fiddly to me; consider
> the case where celery pokes keys in, but our bookkeeping doesn't know
> about it?

I know what you mean about it being fiddly, but I'm kind of envisioning something not all that different from what goes on with Template.render(), settings.EMAIL_BACKEND, or the database backends during test setup and teardown. The bookkeeping required would in truth be nothing more than a set() where you add keys. The monkey patching would entail nothing more than a line to add the passed keys to that set and then a call to the original function. What I'm saying is, I'm just not sure it's much more fiddly than the fiddling we already do in those cases.

I'm not quite sure what you mean about celery poking keys in. Maybe you can elaborate a bit? To be clear, I'm not sure we could keep the actual underlying cache entirely pristine if you were using it in different ways from different systems; the goal would really be to ensure calls using the cache backend API are rolled back with every test.

Anyhow thanks once again for the feedback and the critical thoughts.

Jeremy Dunck

Jul 1, 2011, 5:04:48 PM
to django-d...@googlegroups.com
On Fri, Jul 1, 2011 at 12:27 PM, Jim Dalton <jim.d...@gmail.com> wrote:
> Awesome feedback, thanks.
>
> On Jul 1, 2011, at 10:01 AM, Jeremy Dunck wrote:
>
...

>> If I were to do this, I wouldn't have cache shared between tests or
>> environments, so I'd be tempted to use flush.
>
> I'm not totally sure if I'm following you on this point?

What I mean is, the scenario you described doesn't affect me, because
flushing rather than deleting specific keys would be fine by me. I
guess, generally, I meant to point out that presuming your scenario
applies to the whole community is a bad place to start from. More to
the point, a solution for core should be either an extension point, or
a solution that everyone can agree is commonly good.

>> Rather than "test_" and bookkeeping on key names, what if the key
>> prefix were the hostname + a timestamp from the beginning of the run,
>> so that test keys would never collide?
>
> What I like about this idea is it avoids the bookkeeping and monkey-patching nonsense. What I don't like about it is that it feels like you could potentially be adding a lot of junk to your cache. A test run might end up hitting it with many values, each unique. A requirement I'm sort of working with in my mind here is that the cache returns to the "pristine" state it was in prior to the test run.
>

That's true about collecting garbage. It would matter for filesystem
cache, I guess. I don't much care about that, since cache for me is
either memcache (which dumps LRU entries) or dummy. As far as
returning to "pristine", hmm. If that's the case, I think you're
arguing for setUp and tearDown to do that setup.

>> Or, what if we just had hooks called from setUp and tearDown, either
>> NotImplemented methods, or signals?
>>
>> Book-keeping with monkey-patched methods sounds fiddly to me; consider
>> the case where celery pokes keys in, but our bookkeeping doesn't know
>> about it?
>

...


> I'm not quite sure what you mean about celery poking keys in. Maybe you can elaborate a bit? To be clear, I'm not sure we could keep the actual underlying cache entirely pristine if you were using it in different ways from different systems; the goal would really be to ensure calls using the cache backend API are rolled back with every test.

What I meant to point out is that the test process, which is doing the
book-keeping, may not have visibility into all of the key cleanup that's
needed; a common case is that celery (an out-of-process task queue
commonly used with Django) does some work and pushes some data into the
cache in response to an async task. I'm arguing that if you want a
clean workspace, how you get there depends on both how keys are
generated and what the cache backend is.

Jim Dalton

Jul 1, 2011, 5:47:31 PM
to django-d...@googlegroups.com
On Jul 1, 2011, at 2:04 PM, Jeremy Dunck wrote:

> What I mean is, the scenario you described doesn't affect me, because
> flushing rather than deleting specific keys would be fine by me. I
> guess, generally, I meant to point out that presuming your scenario
> applies to the whole community is a bad place to start from. More to
> the point, a solution for core should be either an extension point, or
> a solution that everyone can agree is commonly good.

Thanks for expanding on this point, I understand now. Just to clarify, I don't presume my scenario is what the community wants -- like you I actually don't care that much about calling cache.flush() where I run tests. I just wanted to throw out some concrete ideas for requirements so that there was a starting point for discussion.

That said, I am certainly still arguing for this requirement. I believe as a general principle that a testing framework should always return the system under test to the state it was in prior to the run. Calling cache.flush() violates that principle -- and of principal concern to me, it flushes your *entire* cache. That means if someone executed tests in production it could kill an entire cache, a situation which could be slightly annoying or a horrific headache depending on the situation.

Of course, if no one else agrees with me that this is a worthy principle for Django tests then I won't waste anyone's time by undertaking a patch :)


> What I meant to point out is that the test process, which is doing the
> book-keeping, may not have visibility to all of the key cleanup that's
> needed; a common use case of that is that celery (an out-of-process
> task queue commonly used with Django) does some work and pushes some
> data into cache in response to an async. I'm arguing that if you want
> a clean workspace, how you get there depends on both how keys are
> generated and what the cache backend is.

Okay, understood for the most part. It's hard for me to tell without having gotten my hands dirty yet how much that will matter in the context of running tests.

But to me it's kind of like sending email. There are plenty of ways to send mail from Python, and your application might very well send actual email out during a test run if e.g. you use an alternate email API of some kind. Django's test suite makes no promise that *no* email will be sent when you run tests, it only promises that if you use the standard email backend API it'll be replaced with a dummy backend -- and that no email will be sent if you use Django's provided API.

Likewise, I'd be doubtful that we could guarantee with certainty that running the tests would not affect the cache at all. What I would hope to accomplish is that if your application calls cache or get_cache, it gets a modified backend that deletes any entries you inserted after each and every test. If your system touches the cache via other means then you're sort of on your own (which of course is the status quo right now for all of the cache).

Just to reiterate, I really do appreciate your taking the time to get back to me with this feedback. Thanks!

Cheers,
Jim

Souen Boniface

Jul 3, 2011, 2:03:05 PM
to django-d...@googlegroups.com
> I believe as a general principle that a testing framework should always return the system under test to the state it was in prior to the run. Calling cache.flush() violates that principle -- and of principal concern to me, it flushes your *entire* cache. That means if someone executed tests in production it could kill an entire cache, a situation which could be slightly annoying or a horrific headache depending on the situation.
>
> Of course, if no one else agrees with me that this is a worthy principle for Django tests then I won't waste anyone's time by undertaking a patch :)

I'd like to +1 on the whole idea.

Making the testing framework use a special key prefix when saving/getting values from cache during the test run (a prefix such as "test_", equivalent to adding the "test_" prefix to DB names, as Jim pointed out) and resetting all records saved in cache during the test run would not only ensure that tests do not pollute the "real" cache with dummy values or alter legitimate values in cache; it would also ensure that tests run against a pristine cache. That is, it would prevent tests from interacting with cache values saved outside the test run (during normal system operation or during a previous test run, for instance), i.e. it would prevent tests from being sensitive to cache state.

Jim Dalton

Jul 4, 2011, 9:54:16 AM
to django-d...@googlegroups.com
I've created a ticket for this and uploaded a patch: https://code.djangoproject.com/ticket/16401

Please feel free to review and let me know if this is a good idea in the first place, as well as if the patch makes sense and is the right approach.

Thanks,
Jim

Jacob Kaplan-Moss

Jul 4, 2011, 11:01:41 AM
to django-d...@googlegroups.com
On Mon, Jul 4, 2011 at 8:54 AM, Jim Dalton <jim.d...@gmail.com> wrote:
> I've created a ticket for this and uploaded a patch: https://code.djangoproject.com/ticket/16401
> Please feel free to review and let me know if this is a good idea in the first place, as well as if the patch makes sense and is the right approach.

Hm.

While I think the feature's a good idea (I've been stung by tests
running against a live cache) I can't say I'm particularly thrilled by
the approach here. Monkeypatching is acceptable in situations like
these, but there's a limit... and I think patching five methods is
starting to reach it. Remember: as much as possible, tests should
operate just like production. Any difference, no matter how small,
introduces the worst kind of bugs: ones that only show up in
production.

I have an alternate proposal:

Instead of using the default cache defined in CACHES, use a "test
cache." This can safely be flushed, meaning that no monkeypatching is
needed. The added benefit here is that this works similarly to the
database engine, so it'll be fairly easy to understand.

Now, unlike DB test we can't just prefix "test_" to the database name
(because we haven't got one), nor can we prefix keys (because not all
cache backends can do something like "delete test:*"). So I'd suggest
we look for a "test" backend defined in CACHES and fall back to a
local memory cache if not defined.

I think this simplifies things immensely, and also gives a fair bit of
control -- if I want to test against a real cache backend, I just
define CACHES['test'] and roll.

Jacob

Jim Dalton

Jul 4, 2011, 5:54:18 PM
to django-d...@googlegroups.com
On Jul 4, 2011, at 8:01 AM, Jacob Kaplan-Moss wrote:

> On Mon, Jul 4, 2011 at 8:54 AM, Jim Dalton <jim.d...@gmail.com> wrote:
>> I've created a ticket for this and uploaded a patch: https://code.djangoproject.com/ticket/16401
>> Please feel free to review and let me know if this is a good idea in the first place, as well as if the patch makes sense and is the right approach.
>
> Hm.
>
> While I think the feature's a good idea (I've been stung by tests
> running against a live cache) I can't say I'm particularly thrilled by
> the approach here. Monkeypatching is acceptable in situations like
> these, but there's a limit... and I think patching five methods is
> starting to reach it. Remember: as much as possible, tests should
> operate just like production. Any difference, no matter how small,
> introduces the worst kind of bugs: ones that only show up in
> production.

Yeah, I agree with you in principle there. I'm definitely an advocate of avoiding monkey patching like the plague in production code.

While I also agree that this code looks a bit hairy when you sort of squint your eyes at it, if you actually look closely at what's going on, we're not modifying things in any way that's tricky or complicated. Really, the only two things we are doing are adding a bit of code that tracks keys and then swapping a few methods out. Pretty similar to what we do already with Template.render -- just x5. The monkey patching is not modifying any arguments or results, nor is it changing the observable behavior of the modified methods. The tests demonstrate this pretty thoroughly.

The only place I've managed to think of where this is different from production is in the length of the key. Because we are prepending "_test_" to KEY_PREFIX, you're going to get six fewer characters in your key than you would otherwise. It *is* a difference, so I don't want to downplay it, but at the same time it's a difference where you'll get an error in testing that you wouldn't get in production -- which is better than the opposite.

The only other place where someone might run into trouble is if they themselves were monkey patching the cache and their changes collided somehow. I understand that this is the kind of thing that happens in Ruby a lot, and it strikes me as unpleasant.

Anyhow, that's not to say that there isn't a better solution or that your proposal below isn't preferable (more on that in a sec), but I'd argue that my patch is at least a plausible solution, i.e. it works in lieu of a better alternative.

> I have an alternate proposal:
>
> Instead of using the default cache defined in CACHES, use a "test
> cache." This can safely be flushed, meaning that no monkeypatching is
> needed. The added benefit here is that this works similarly to the
> database engine, so it'll be fairly easy to understand.
>
> Now, unlike DB test we can't just prefix "test_" to the database name
> (because we haven't got one), nor can we prefix keys (because not all
> cache backends can do something like "delete test:*"). So I'd suggest
> we look for a "test" backend defined in CACHES and fall back to a
> local memory cache if not defined.
>
> I think this simplifies things immensely, and also gives a fair bit of
> control -- if I want to test against a real cache backend, I just
> define CACHES['test'] and roll.


So, in many ways I like this idea. I agree that the implementation is going to be more straightforward. Here's just a quick list of concerns or possible drawbacks I was able to think of:

* Not sure how to handle multiple cache backends. I imagine if you're someone who has implemented multiple cache backends, then you are using them in specific ways for specific reasons. If your tests end up sharing a single test backend, that could result in unpredictable behavior. One possible resolution would be to amend your proposal a bit: for each cache backend defined, we swap it out with the locmem backend unless you have yourself defined a test_CACHE_NAME version of that cache. So e.g. if I have three backends named "default", "file" and "db", I then need to define test_default, test_file and test_db if I want to specify how the test framework sets up the test cache. Otherwise, it'll fall back to replacing each with a locmem cache.
* At least when it comes to memcached, this doesn't help much with the cache.clear() issue, since cache.clear() will blank your entire memcached cache. So if we're calling cache.clear() at the end of each test and it's blanking the whole cache, I start to wonder why we bother at all? In other words, why not just call cache.clear() at the end of every test and call it a day, and make a note in the documentation that running the tests will clear any cache you have defined in CACHES, so use it with caution. One response is that the same thing is not true for db, file and locmem -- in those cases you'll only be deleting the test cache that was set up for them and leaving everything else intact. But I don't know; I get the sense that memcached is what most people are using for caching in production, so to me it's far and away the most important cache to consider.
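The per-backend amendment in the first point above could be sketched like so; the function name and the "test_<alias>" convention are hypothetical, following the description rather than any existing patch:

```python
LOCMEM_CONFIG = {"BACKEND": "django.core.cache.backends.locmem.LocMemCache"}

def resolve_test_caches(caches):
    """For each configured cache alias, use a project-supplied
    'test_<alias>' config if one exists, else fall back to locmem.
    The test_ entries themselves are templates, not live aliases."""
    resolved = {}
    for alias in caches:
        if alias.startswith("test_"):
            continue
        resolved[alias] = caches.get("test_" + alias, LOCMEM_CONFIG)
    return resolved

CACHES = {
    "default": {"BACKEND": "django.core.cache.backends.memcached.MemcachedCache"},
    "file": {"BACKEND": "django.core.cache.backends.filebased.FileBasedCache"},
    # the developer opts "file" into a real test cache of their choosing:
    "test_file": {
        "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
        "KEY_PREFIX": "file-tests",
    },
}
resolved = resolve_test_caches(CACHES)
```

Here "default" silently becomes locmem while "file" uses the explicitly provided test config, which keeps the multiple-backend case predictable.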

Those were my major concerns, though I'm not sure any of them are fatal at least from my perspective.

At present our choices boil down to:

1. Monkey patch cache, keep track of set keys, flush set keys at the end of each test.
2. Default to locmem cache, use a 'test' cache (or possibly 'test_CACHE_NAME') instead if found. Call cache.clear() on the test cache after each test.
3. Just cache.clear() after every test, note in docs, call it a day.
4. Do nothing i.e. keep status quo.
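Option 3 is simple enough to sketch with plain unittest. `FakeCache` below is a stand-in for Django's cache object so the sketch is self-contained; the mechanics are just an unconditional clear() in tearDown:

```python
import unittest

class FakeCache:
    """Minimal stand-in for a cache backend."""
    def __init__(self):
        self._data = {}
    def set(self, key, value):
        self._data[key] = value
    def get(self, key, default=None):
        return self._data.get(key, default)
    def clear(self):
        self._data.clear()

cache = FakeCache()

class CacheClearingTestCase(unittest.TestCase):
    """Option 3: unconditionally wipe the whole cache after each test.
    Simple, but note this would also blank a shared memcached instance."""
    def tearDown(self):
        cache.clear()

class ExampleTest(CacheClearingTestCase):
    def test_sets_a_key(self):
        cache.set("k", 1)
        self.assertEqual(cache.get("k"), 1)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

After the run, "k" is gone: each test starts from an empty cache, at the cost of clearing everything rather than just the test's own keys.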

Speaking personally, any of 1-3 are better than #4. I also wonder if #3 is worth considering. It's certainly the easiest, and part of me wonders if objections to it are purely hypothetical.

Anyhow, thanks for giving me your feedback Jacob and obviously thanks for your alternative proposal. I'd be happy to pitch in and work on that idea if that's what you guys all lean toward. In the meantime I'd be interested in other opinions as well.

Cheers,
Jim
