support for http purge?

Sven Fuchs

Aug 30, 2009, 2:07:29 PM

to rack-...@googlegroups.com

Looking for purge support for Rack::Cache I've found this thread today:

http://groups.google.com/group/rack-cache/browse_thread/thread/55ffd7919b03a7b0

I've sent a message to Pat but haven't heard back yet, so I thought I'd
just go ahead and play a bit.

As far as I understand http purge support in Varnish it simply accepts
a non-standard PURGE verb and then purges the cache for the given URL.

I've added something like that here:

http://github.com/svenfuchs/rack-cache/commit/95415486df1370db140f5ee84a1bf63819286a2b
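
To make the PURGE idea concrete, here is a minimal sketch of a Rack-style
middleware that handles the non-standard verb. It is not the code from the
commit above: the class name PurgeVerb, the plain Hash standing in for the
metastore, and the 200/404 responses are all assumptions for illustration.

```ruby
# Hypothetical sketch: a plain Hash stands in for the real rack-cache
# metastore, and the cache key is simply the request path.
class PurgeVerb
  def initialize(app, store)
    @app = app
    @store = store # assumed shape: { path => cached response }
  end

  def call(env)
    # Anything that is not a PURGE request goes straight to the app.
    return @app.call(env) unless env['REQUEST_METHOD'] == 'PURGE'

    key = env['PATH_INFO']
    if @store.delete(key)
      [200, { 'Content-Type' => 'text/plain' }, ["Purged #{key}\n"]]
    else
      [404, { 'Content-Type' => 'text/plain' }, ["Not cached: #{key}\n"]]
    end
  end
end

store = { '/blog' => 'cached index page' }
app   = ->(_env) { [200, { 'Content-Type' => 'text/plain' }, ['from app']] }
stack = PurgeVerb.new(app, store)

status, _headers, _body =
  stack.call({ 'REQUEST_METHOD' => 'PURGE', 'PATH_INFO' => '/blog' })
```

After the call, the '/blog' entry is gone from the store and a second PURGE
for the same path would take the "not cached" branch.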

I've also been thinking about an X-Cache-Purge header that allows the
application to instruct Rack::Cache to purge arbitrary cache entries.
E.g. when a blog article is displayed on the blog index page, the
application could use this header to also expire /blog in response to
a PUT to /blog/articles/1 (or something like that, you get the idea).

Then again this wouldn't necessarily need to go into Rack::Cache
itself. As Rack::Cache is passing the metastore instance along with
the request object another middleware layer could look for this header
and purge stuff itself.
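
A minimal sketch of such a separate middleware layer, assuming the header
carries a comma-separated list of paths and that a plain Hash stands in for
the metastore. The class name PurgeHeader and the header format are
hypothetical, not anything rack-cache defines.

```ruby
# Hypothetical sketch: reads X-Cache-Purge from the application's response
# and purges the listed paths from a Hash-based stand-in for the metastore.
class PurgeHeader
  HEADER = 'X-Cache-Purge'

  def initialize(app, store)
    @app = app
    @store = store
  end

  def call(env)
    status, headers, body = @app.call(env)
    # Delete the header so it never reaches the client, then purge each path.
    if (paths = headers.delete(HEADER))
      paths.split(',').each { |path| @store.delete(path.strip) }
    end
    [status, headers, body]
  end
end

store = {
  '/blog'            => 'index',
  '/blog/articles/1' => 'article',
  '/about'           => 'about'
}
# The application answers a PUT and asks for two entries to be expired.
app = lambda do |_env|
  [200, { 'X-Cache-Purge' => '/blog, /blog/articles/1' }, ['updated']]
end

_status, headers, _body =
  PurgeHeader.new(app, store).call({ 'REQUEST_METHOD' => 'PUT' })
```

Only '/about' is left in the store afterwards, and the response headers no
longer contain X-Cache-Purge.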

Any ideas?

Ryan Tomayko

Sep 2, 2009, 2:17:08 AM

to rack-...@googlegroups.com

On Sun, Aug 30, 2009 at 11:07 AM, Sven Fuchs<sven...@artweb-design.de> wrote:
> Looking for purge support for Rack::Cache I've found this thread today:
>
> http://groups.google.com/group/rack-cache/browse_thread/thread/55ffd7919b03a7b0
>
> I've sent a message to Pat but haven't heard back yet, so I thought I'd
> just go ahead and play a bit.
>
> As far as I understand http purge support in Varnish it simply accepts
> a non-standard PURGE verb and then purges the cache for the given URL.
>
> I've added something like that here:
>
> http://github.com/svenfuchs/rack-cache/commit/95415486df1370db140f5ee84a1bf63819286a2b

I like it. We'd need to put some kind of access control on it, like
only allowing PURGE from localhost. Or maybe we punt and make that the
web server's job (i.e., you'd have to configure nginx to not allow
PURGE through).
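
One way the "only from localhost" check could look, as a hedged sketch. The
class name LocalPurgeOnly and the 403 response are assumptions, and the
check trusts REMOTE_ADDR, which is only meaningful when no proxy in front
of the app rewrites it.

```ruby
# Hypothetical guard: rejects PURGE unless it comes from a loopback address.
class LocalPurgeOnly
  LOCAL = %w[127.0.0.1 ::1].freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    if env['REQUEST_METHOD'] == 'PURGE' && !LOCAL.include?(env['REMOTE_ADDR'])
      return [403, { 'Content-Type' => 'text/plain' }, ["Forbidden\n"]]
    end
    # Local PURGE requests and all other verbs continue down the stack.
    @app.call(env)
  end
end

app   = ->(_env) { [200, { 'Content-Type' => 'text/plain' }, ['ok']] }
guard = LocalPurgeOnly.new(app)

remote, = guard.call({ 'REQUEST_METHOD' => 'PURGE', 'REMOTE_ADDR' => '203.0.113.7' })
local,  = guard.call({ 'REQUEST_METHOD' => 'PURGE', 'REMOTE_ADDR' => '127.0.0.1' })
```

Here `remote` is 403 and `local` is 200.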

This is definitely an oft requested feature. I haven't been able to
settle on whether to use something like what you've done here or to
provide an API for manually purging / invalidating entries. e.g.,
Rack::Cache would add some object to the Rack env that provided a
"purge" method so downstream apps could do something like:

env['rack-cache.thing'].purge "/foo", "/bar"

Both approaches have pros and cons.

> I've also been thinking about an X-Cache-Purge header that allows the
> application to instruct Rack::Cache to purge arbitrary cache entries.
> E.g. when a blog article is displayed on the blog index page the
> application could use this header to expire both /blog in response to
> PUT to /blog/articles/1 (or something like that, you get the idea).

Yep. Another good approach. This doesn't have the issues with access
control so I think it would be easier to accept something and get it
out in a release.

> Then again this wouldn't necessarily need to go into Rack::Cache
> itself. As Rack::Cache is passing the metastore instance along with
> the request object another middleware layer could look for this header
> and purge stuff itself.
>
> Any ideas?

These are all very feasible. I'd love to hear how PURGE is working for
you - it's probably my least favorite of the approaches you've laid
out here. I guess we should just pick something and go with it. I was
worried that there'd end up being too many ways of purging but that's
just blocking progress at this point.

I plan on spending some time getting a new release together within the
next couple of weeks. Some kind of manual purge is going to be a part
of it. I appreciate the patch. Do share any other experiments.

Thanks,
Ryan

Sven Fuchs

Sep 2, 2009, 6:18:37 AM

to rack-...@googlegroups.com

Hi Ryan,

On 02.09.2009, at 08:17, Ryan Tomayko wrote:
>> http://github.com/svenfuchs/rack-cache/commit/95415486df1370db140f5ee84a1bf63819286a2b
>
> I like it.

Nice :)

> We'd need to put some kind of access control on it, like
> only allowing PURGE from localhost. Or maybe we punt and make that the
> web server's job (i.e., you'd have to configure nginx to not allow
> PURGE through).

Initially I thought it was just up to the server config to disallow
this. But then again rack-cache should probably ship in a state that's
secure without any further (potentially advanced, from the user's pov)
server configuration.

Maybe the easiest way forward would be to disallow PURGE by default,
have developers allow it based on a rack-cache config option and tell
them they also need to take care of security themselves in their
webservers.

Something like

use Rack::Cache, :allow_purge => true
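
A rough sketch of how an opt-in switch like this could behave, with PURGE
rejected unless the option is set. The class name OptionalPurge, the
Hash-based store, and the 405 response with its Allow value are assumptions
for illustration, not rack-cache's actual behaviour.

```ruby
# Hypothetical sketch: PURGE is refused unless allow_purge is switched on.
class OptionalPurge
  def initialize(app, store, allow_purge: false)
    @app = app
    @store = store
    @allow_purge = allow_purge
  end

  def call(env)
    return @app.call(env) unless env['REQUEST_METHOD'] == 'PURGE'
    # Assumed response for the disabled case; the Allow list is made up.
    return [405, { 'Allow' => 'GET, HEAD' }, []] unless @allow_purge

    @store.delete(env['PATH_INFO'])
    [200, { 'Content-Type' => 'text/plain' }, ["Purged\n"]]
  end
end

app     = ->(_env) { [200, {}, ['ok']] }
store   = { '/blog' => 'cached' }
request = { 'REQUEST_METHOD' => 'PURGE', 'PATH_INFO' => '/blog' }

default_status, = OptionalPurge.new(app, store).call(request)
enabled_status, = OptionalPurge.new(app, store, allow_purge: true).call(request)
```

With the default, the entry survives and the status is 405; with the option
enabled the entry is removed and the status is 200.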

> This is definitely an oft requested feature. I haven't been able to
> settle on whether to use something like what you've done here or to
> provide an API for manually purging / invalidating entries. e.g.,
> Rack::Cache would add some object to the Rack env that provided a
> "purge" method so downstream apps could do something like:
>
> env['rack-cache.thing'].purge "/foo", "/bar"

Nice idea!

Although this could happen downstream in some extra (rack-cache
targeted) middleware I guess. Unless I am missing something one can
always use something like

uri = @env['rack-cache.metastore']
storage = Rack::Cache::Storage.instance
storage.resolve_metastore_uri(uri)

to access the storages and then purge?

> Both approaches have pros and cons.
>
>> I've also been thinking about an X-Cache-Purge header that allows the
>> application to instruct Rack::Cache to purge arbitrary cache entries.
>> E.g. when a blog article is displayed on the blog index page the
>> application could use this header to expire both /blog in response to
>> PUT to /blog/articles/1 (or something like that, you get the idea).
>
> Yep. Another good approach. This doesn't have the issues with access
> control so I think it would be easier to accept something and get it
> out in a release.
>
>> Then again this wouldn't necessarily need to go into Rack::Cache
>> itself. As Rack::Cache is passing the metastore instance along with
>> the request object another middleware layer could look for this
>> header
>> and purge stuff itself.
>>
>> Any ideas?
>
> These are all very feasible. I'd love to hear how PURGE is working for
> you - it's probably my least favorite of the approaches you've laid
> out here.

To be honest, so far I've really only played with it. From my tests it
seems to work fine though.

We have a fairly big application that still uses Rails page caching plus
some custom expiration funkiness, and we intend to replace this with
rack-cache soon (which was my initial motivation to look into
this).

> I guess we should just pick something and go with it. I was
> worried that there'd end up being too many ways of purging but that's
> just blocking progress at this point.
>
> I plan on spending some time getting a new release together within the
> next couple of weeks. Some kind of manual purge is going to be a part
> of it. I appreciate the patch. Do share any other experiments.

So, there are at least three approaches and, as you said, they all
have different pros and cons. Personally I'd provide support for all
of them - for that very reason. Although I'd probably only include the
most generic one in rack-cache itself.

I see the options like this:

- HTTP PURGE: the most "standard" way, the same thing Varnish is doing,
  really simple; requires going through HTTP from upstream though,
  causing some overhead
- X-Cache-Purge header: also a rather "standard" way of doing
  things; small overhead for passing/recovering the header
- env[rack.purge-thing]: implies a tight coupling of client logic to
  rack-cache; probably the fastest/cheapest way though

If I were to decide this, I'd tend to include only HTTP PURGE in
rack-cache. It's the only method that, as an extra middleware layer,
would have to sit on top of (upstream of) rack-cache to avoid
extra overhead. It could actually sit in a middleware
downstream from rack-cache, but that would cause an extra invalidate
call (maybe not that big a deal though).

All of them could be implemented in an extra layer, rack-cache-purge,
which looks up and uses the rack-cache storage instances. This layer
could be seen as a playground or place for experiments with rack-cache
purge support, and as soon as any of it proves useful and stable you
could re-evaluate including it directly in rack-cache. This is kind of
how we do it with Rails/I18n: we encourage people to try
features in plugin-land before we consider including them in the I18n
gem or Rails. Not sure whether rack-cache needs that defensive a
strategy in general, but I'd think it could apply well in this case.

Oh, and thanks for a beautiful piece of software btw. :)

Sven Fuchs

Sep 2, 2009, 6:27:57 AM

to rack-...@googlegroups.com

I played with some rack-cache-purge middleware over the weekend and
now pushed it here:

http://github.com/svenfuchs/rack-cache-purge

Ryan Tomayko

Sep 5, 2009, 4:05:48 AM

to rack-...@googlegroups.com

On Wed, Sep 2, 2009 at 3:18 AM, Sven Fuchs<sven...@artweb-design.de> wrote:
> Hi Ryan,
>
> On 02.09.2009, at 08:17, Ryan Tomayko wrote:
>>> http://github.com/svenfuchs/rack-cache/commit/95415486df1370db140f5ee84a1bf63819286a2b
>>
>> I like it.
>
> Nice :)
>
>> We'd need to put some kind of access control on it, like
>> only allowing PURGE from localhost. Or maybe we punt and make that the
>> web server's job (i.e., you'd have to configure nginx to not allow
>> PURGE through).
>
> Initially I thought it's just up to the server config to disallow
> this. But then again rack-cache should probably ship in a state that's
> secure without any further (potentially advanced, from the user's pov)
> server configuration.
>
> Maybe the easiest way forward would be to disallow PURGE by default,
> have developers allow it based on a rack-cache config option and tell
> them they also need to take care of security themselves in their
> webservers.
>
> Something like
>
> use Rack::Cache, :allow_purge => true

I like it. I think I'll just merge your previous patch and add this
option, disabled by default as you have here.

>> This is definitely an oft requested feature. I haven't been able to
>> settle on whether to use something like what you've done here or to
>> provide an API for manually purging / invalidating entries. e.g.,
>> Rack::Cache would add some object to the Rack env that provided a
>> "purge" method so downstream apps could do something like:
>>
>> env['rack-cache.thing'].purge "/foo", "/bar"
>
> Nice idea!
>
> Although this could happen downstream in some extra (rack-cache
> targeted) middleware I guess. Unless I am missing something one can
> always use something like
>
> uri = @env['rack-cache.metastore']
> storage = Rack::Cache::Storage.instance
> storage.resolve_metastore_uri(uri)
>
> to access the storages and then purge?

Yeah. I've sent that exact code snippet out to a few people recently
who were wanting to do things like manual purge or check if there was
anything in the cache for a given URL. I'd like to make it a little
less convoluted and maybe provide a separate, documented interface for
doing these kinds of operations in downstream apps / middleware. The
metastore interface is kind of a PITA to deal with. It should be
possible to put an object in the env that implements the following
without too much effort:

- purge(*keys)
- flush_all
- fresh?(key)
- valid?(key)

That should satisfy a big majority of the manual purge / cache
introspection cases I've seen in the wild. It would also make it a tad
easier to implement things like X-Cache-Purge in a downstream
middleware.
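
A minimal sketch of an object with those four methods, to show how small
the surface is. Everything about it is an assumption: the class name
CacheControlSketch, the Hash of { key => expiry Time } standing in for the
metastore, and in particular the meanings chosen here, where valid? means
"an entry exists" and fresh? means "an entry exists and has not expired".

```ruby
# Hypothetical sketch of the four-method interface; not rack-cache code.
class CacheControlSketch
  # entries: { key => Time at which the entry expires }
  # clock:   injectable so the example does not depend on the wall clock
  def initialize(entries = {}, clock: -> { Time.now })
    @entries = entries
    @clock = clock
  end

  def purge(*keys)
    keys.each { |key| @entries.delete(key) }
    self
  end

  def flush_all
    @entries.clear
    self
  end

  # Assumed meaning: an entry is stored under this key, fresh or not.
  def valid?(key)
    @entries.key?(key)
  end

  # Assumed meaning: an entry is stored and its expiry lies in the future.
  def fresh?(key)
    valid?(key) && @entries[key] > @clock.call
  end
end

now = Time.at(1_000)
cache = CacheControlSketch.new(
  { '/foo' => now + 60, '/bar' => now - 60, '/baz' => now + 60 },
  clock: -> { now }
)

cache.fresh?('/foo') # => true
cache.fresh?('/bar') # => false (expired), though cache.valid?('/bar') is true
cache.purge('/foo', '/bar')
```

A downstream app would then reach it through the env, along the lines of
the env['rack-cache.thing'].purge "/foo", "/bar" call mentioned earlier.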

Oh wow. I'm looking at rack-cache-purge now and it appears you have
almost all of this stuff implemented. This all looks really fairly
solid to me. I think we should start working on integrating it into
rack-cache proper. I'll try to experiment with a more generic cache
manipulation/introspection object (as opposed to rack-cache.purger)
this weekend and then start bringing most of what you have over if
that's cool with you.

> Oh, and thanks for a beautiful piece of software btw. :)

Thanks for making it more beautiful :)

Ryan

Sven Fuchs

Sep 5, 2009, 4:59:02 AM

to rack-...@googlegroups.com

> metastore interface is kind of a PITA to deal with. It should be
> possible to put an object in the env that implements the following
> without too much effort:
>
> - purge(*keys)
> - flush_all
> - fresh?(key)
> - valid?(key)
>
> That should satisfy a big majority of the manual purge / cache
> introspection cases I've seen in the wild. It would also make it a tad
> easier to implement things like X-Cache-Purge in a downstream
> middleware.

Sounds good to me.

> Oh wow. I'm looking at rack-cache-purge now and it appears you have
> almost all of this stuff implemented. This all looks really fairly
> solid to me. I think we should start working on integrating it into
> rack-cache proper. I'll try to experiment with a more generic cache
> manipulation/introspection object (as opposed to rack-cache.purger)
> this weekend and then start bringing most of what you have over if
> that's cool with you.

Absolutely, that's why it's there :)

Let me list a few things that I've encountered trying to extend
Rack::Cache.

- It would be great to be able to reuse Rack::Cache::Key.
Unfortunately it only accepts a Request object. Thus, when I only
have a URI (as in X-Cache-Purge) I'd need to instantiate a Request
object just to get the key. I've tried to extract that logic to a
more reusable Rack::Cache::Tools::Key class, see
http://github.com/svenfuchs/rack-cache-purge/blob/master/lib/rack/cache/tools/key.rb

- I believe there's also a bug in Rack::Cache::Key that makes it
always append a question mark to the key when the query_string part is
an empty string: http://github.com/svenfuchs/rack-cache/blob/master/lib/rack/cache/key.rb#L43
... maybe that should be

@request.query_string.nil? || @request.query_string.empty?
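
For illustration, a sketch of a key builder that takes a plain URI string
instead of a Request and leaves out the "?" when the query is nil or empty.
The helper name cache_key_for is hypothetical; this is not Rack::Cache::Key,
and it makes no attempt to normalise the query string.

```ruby
require 'uri'

# Hypothetical helper: builds a cache key from a URI string.
def cache_key_for(uri_string)
  uri = URI.parse(uri_string)
  key = "#{uri.scheme}://#{uri.host}"
  # Only include the port when it differs from the scheme's default.
  key << ":#{uri.port}" unless uri.port == uri.default_port
  key << uri.path
  query = uri.query
  # Treat a missing query and an empty query the same: no trailing "?".
  key << "?#{query}" unless query.nil? || query.empty?
  key
end

cache_key_for('http://example.org/blog')
# => "http://example.org/blog"
cache_key_for('http://example.org/blog?')
# => "http://example.org/blog"
cache_key_for('http://example.org:8080/blog?page=2')
# => "http://example.org:8080/blog?page=2"
```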

- I've also tried to extract Rack::Cache::Tools::Options so that I
could reuse that logic in other middlewares, too (although it's not
really Cache related). See
http://github.com/svenfuchs/rack-cache-purge/blob/master/lib/rack/cache/tools/options.rb
I have the feeling this could probably be further improved though.
E.g. line #33 looks weird to me
(http://github.com/svenfuchs/rack-cache-purge/blob/master/lib/rack/cache/tools/options.rb#L33)
and might be the reason why I have to set stuff to the environment
manually? (e.g.
http://github.com/svenfuchs/rack-cache-purge/blob/159483cbcb420e89af7340d49bb976ad6c66d881/lib/rack/cache/purge/context.rb#L23)

One other thing:

My background is that, as I said, I want to use Rack::Cache to extract
caching from an application that's using page caching. We're also using
some custom code that allows us to "tag" cache entries so that we can
expire cache entries based on tags. That is helpful in order to
decouple knowledge about cache entries from expiration logic. E.g.
when blog/article/1 displays user-1's name, it would be tagged
"user-1". On PUT to user/1 the application would just expire all
entries tagged "user-1" (as opposed to having to know about
blog/article/1).

I've thus tried to extend metastore so that I could pass X-Cache-Tags
headers and store/purge these alongside the entries. A tag storage
itself is trivial, but it's not that easy to extend metastore without
wild monkeypatches.
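
To show what "trivial" could mean here, a minimal sketch of a tag index
kept next to the cache entries. The class name TagStore, the
comma-separated X-Cache-Tags format, and the Hash of entries are
assumptions for illustration only.

```ruby
require 'set'

# Hypothetical tag index: maps each tag to the cache keys that carry it.
class TagStore
  def initialize
    @keys_by_tag = Hash.new { |hash, tag| hash[tag] = Set.new }
  end

  # Record the tags from an X-Cache-Tags style header for a cache key.
  def store(key, tags_header)
    tags_header.to_s.split(',').each { |tag| @keys_by_tag[tag.strip] << key }
  end

  # Forget the tag and return the keys that were filed under it.
  # Keys may still be listed under other tags; this sketch ignores that.
  def purge(tag)
    (@keys_by_tag.delete(tag) || Set.new).to_a
  end
end

entries = {
  '/blog/article/1' => 'page showing user-1',
  '/about'          => 'static page'
}
tags = TagStore.new
tags.store('/blog/article/1', 'user-1, article-1')

# On PUT to user/1: expire everything tagged "user-1".
tags.purge('user-1').each { |key| entries.delete(key) }
```

The application only needs to know the tag; the '/blog/article/1' entry is
removed without the update code ever naming that path.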

One thing that comes to mind here would be callbacks on storages like
after_read, after_store, after_purge or something like that. Another
pattern could be to move all the logic to base classes or modules and
extend them with empty concrete classes. That would allow users to
inject modules into Ruby's method lookup chain, which could extend
methods like store, fetch, purge etc. this way. E.g. http://gist.github.com/181350
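
A small sketch of the second pattern, with all logic in a module and an
intentionally empty concrete class. The names MetaStoreBehavior, HeapStore
and PurgeLogging are hypothetical; this shows neither the actual rack-cache
classes nor the contents of the gist.

```ruby
# All behaviour lives in a module rather than in the concrete class.
module MetaStoreBehavior
  def store(key, entry)
    entries[key] = entry
  end

  def purge(key)
    entries.delete(key)
  end

  def entries
    @entries ||= {}
  end
end

# Intentionally empty: modules included later are looked up before
# MetaStoreBehavior, so they can wrap its methods and call super.
class HeapStore
  include MetaStoreBehavior
end

# A user-supplied extension that hooks into purge.
module PurgeLogging
  def purge(key)
    (@purged ||= []) << key # stands in for an after_purge callback
    super
  end

  def purged
    @purged || []
  end
end

HeapStore.include(PurgeLogging)

store = HeapStore.new
store.store('/blog', 'index')
store.purge('/blog')
store.purged # => ["/blog"]
```

Had purge been defined directly in HeapStore, the included module would sit
behind it in the lookup chain and never be called, which is why the
concrete class stays empty.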

Wdyt?
