This was sparked by this post by DHH:
http://37signals.com/svn/posts/3113-how-key-based-cache-expiration-works
At first I was excited to read about a method I hadn't seen before, ActiveRecord's `cache_key`, and I was ready to restructure our entire cache strategy around it. However, I realized that, although it is much easier to maintain, it seems much less performant than manually expiring cache keys. It also seems to work well only with a very specific data structure (the one DHH uses in his post, for example).
I would very much like to use this technique and mostly forget about manually expiring cache fragments, but a few things are keeping me from moving in this direction. I want someone to read this and tell me why I'm wrong, and why auto-expiring keys are definitely the best way to go.
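For anyone who hasn't read the linked post, here is a minimal plain-Ruby sketch of the idea behind `cache_key`. The `Post` struct is a hypothetical stand-in for an ActiveRecord model, and the key format only approximates the "model/id-timestamp" shape described in the post.

```ruby
# Hypothetical stand-in for an ActiveRecord model; only the fields that
# feed into the key are modelled here.
Post = Struct.new(:id, :updated_at) do
  # Approximates the "model/id-timestamp" shape of a key-based cache key.
  def cache_key
    "posts/#{id}-#{updated_at.utc.strftime('%Y%m%d%H%M%S')}"
  end
end

post = Post.new(1, Time.utc(2012, 2, 1, 10, 0, 0))
old_key = post.cache_key # "posts/1-20120201100000"

# Saving a record bumps updated_at, which yields a brand-new key.
post.updated_at = Time.utc(2012, 2, 1, 10, 5, 0)
new_key = post.cache_key # "posts/1-20120201100500"
```

Nothing deletes the entry stored under the old key; it is simply never read again and is left for the cache store to evict.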
A little context: The website I work for gets an average of about 30,000 visits per day - not a ton but definitely enough that little things make a big difference in performance.
**TL;DR**: This technique seems to require too many queries and too many renders. Manually expiring lets us cache larger chunks of data. I am looking for opinions, thoughts, and especially arguments on this.
Consider this example, where I'd like to display a list of blogs and each blog's 5 most recent posts:
**blogs/index.html.erb**

```erb
<% @blogs.each do |blog| %>
  <% cache blog do %>
    <%= render partial: "posts/post", collection: blog.posts.recent.limit(5) %>
  <% end %>
<% end %>
```
**posts/_post.html.erb**

```erb
<% cache post do %>
  <h2><%= post.title %></h2>
  <p><%= post.body %></p>
<% end %>
```
The first line of `blogs/index.html.erb` (the `@blogs.each` loop) will perform a database query no matter what, on every page load. It also requires one hit to the cache store per blog to check for that blog's `cache_key`.
If any post in a blog is updated, that blog's block has to go through the post partial 5 times, no matter what. It also has to fire off a query to the database to retrieve those 5 posts. At this point, with the 5 posts loaded into memory and the partials being invoked anyway, what is really the performance difference between fetching the HTML fragment for a post from the cache and just rendering the partial as usual? My guess is that it's negligible, but I hope that I am wrong.
Now consider a second example, where I simply want to render the 5 most recent posts, regardless of which blog they belong to:
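To make the bookkeeping concrete, here is a rough plain-Ruby simulation of the nested fragments above. Everything in it is a hypothetical stand-in (`CountingCache`, the `version` field in place of `updated_at`, the render helpers); it only counts cache reads and template-body renders and says nothing about real timings or the query for the 5 posts.

```ruby
# Toy in-memory cache that counts reads; a stand-in for memcached or Redis.
class CountingCache
  attr_reader :reads

  def initialize
    @store = {}
    @reads = 0
  end

  # Return the cached value for key, or run the block and store its result.
  def fetch(key)
    @reads += 1
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end

  def reset_counters
    @reads = 0
  end
end

# `version` plays the role of updated_at in the cache key.
Post = Struct.new(:id, :title, :version) do
  def cache_key
    "posts/#{id}-#{version}"
  end
end

Blog = Struct.new(:id, :version, :posts) do
  def cache_key
    "blogs/#{id}-#{version}"
  end
end

CACHE = CountingCache.new
RENDERS = Hash.new(0) # how often each template body actually ran

def render_post(post)
  CACHE.fetch(post.cache_key) do
    RENDERS[:post] += 1
    "<h2>#{post.title}</h2>"
  end
end

def render_blog(blog)
  CACHE.fetch(blog.cache_key) do
    RENDERS[:blog] += 1
    blog.posts.map { |post| render_post(post) }.join
  end
end

posts = (1..5).map { |i| Post.new(i, "Post #{i}", 1) }
blog = Blog.new(1, 1, posts)

render_blog(blog) # cold cache: fills the blog fragment and 5 post fragments
CACHE.reset_counters
RENDERS.clear

# Simulate editing one post, with the change propagated to its blog.
posts[2].version += 1
blog.version += 1

render_blog(blog)
# CACHE.reads is now 6 (1 blog miss + 5 post lookups);
# RENDERS is { blog: 1, post: 1 } because four post fragments were reused.
```

In this toy model the four untouched posts cost a cache read each rather than a template render. Whether that is a real win over rendering them is exactly what I'd need a benchmark to answer.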
**posts/recent.html.erb**

```erb
<% @posts = Post.recent.limit(5) %>
<% cache @posts do %>
  <%= render @posts %>
<% end %>
```
Same situation here: By calling `cache @posts`, we're firing off that query, therefore defeating one of the awesome advantages of an ActiveRecord::Relation - lazy queries. And then we have to render the `post` partial 5 times, and at that point, with the post ready to go, is caching really going to help that much?
Auto-expiring keys don't support arbitrary view fragments, i.e., fragments of HTML that aren't tied to any model object:
**posts/recent.html.erb**

```erb
<% cache "recent_posts" do %>
  <% @posts = Post.recent.limit(5) %>
  <%= render @posts %>
<% end %>
```
This method (on cache hit):
* Will not perform any database queries
* Doesn't need to instantiate an ActiveRecord::Relation object
* Doesn't render any partials
* Only needs to check the cache for a single key
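The points above can be illustrated with a small plain-Ruby sketch. `recent_posts` and `cached_fragment` are hypothetical stand-ins for `Post.recent.limit(5)` and the `cache` view helper; the sketch only counts how often the "query" runs.

```ruby
QUERIES = Hash.new(0)
FRAGMENTS = {}

# Hypothetical stand-in for Post.recent.limit(5); it only counts calls.
def recent_posts
  QUERIES[:posts] += 1
  %w[a b c d e]
end

# Mimics `cache "recent_posts" do ... end`: on a hit the block, including
# the query inside it, is skipped entirely.
def cached_fragment(key)
  return FRAGMENTS[key] if FRAGMENTS.key?(key)
  FRAGMENTS[key] = yield
end

def recent_posts_html
  cached_fragment("recent_posts") do
    recent_posts.map { |title| "<h2>#{title}</h2>" }.join
  end
end

recent_posts_html # miss: runs the query and renders
recent_posts_html # hit: one key lookup, no query, no render
queries_after_hit = QUERIES[:posts] # 1

FRAGMENTS.delete("recent_posts") # manual expiry, e.g. from an observer
recent_posts_html                # miss again
queries_after_expiry = QUERIES[:posts] # 2
```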
The only downside, of course, is that the cache needs to be manually expired - but that's, what, 5 lines in an observer?
**post_observer.rb**
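
```ruby
class PostObserver < ActiveRecord::Observer
  # Expire the shared fragment whenever any post is created or updated.
  def after_save(post)
    # expire_fragment adds the "views/" namespace itself, so pass the same
    # name that was given to `cache` in the view.
    ActionController::Base.new.expire_fragment "recent_posts"
  end
end
```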
Of course, if you have a lot of places where this object is being represented, you'd have to expire several fragments. But with Redis, you can take advantage of sets and `SMEMBERS` to keep track of them.
Auto-expiring keys also require an extra write to the database to update associated objects (such as a Blog) when a Post is saved.
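Roughly what I have in mind for tracking several fragments, sketched with an in-memory `Set` standing in for a Redis set. `write_fragment`, `expire_for`, and the `"post:1"` naming are all made up for illustration; with a real Redis client the marked lines would become `SADD`, `SMEMBERS`, and `DEL` calls.

```ruby
require "set"

# In-memory stand-ins: FRAGMENTS plays the fragment store, KEY_SETS plays
# one Redis set per object.
FRAGMENTS = {}
KEY_SETS = Hash.new { |hash, name| hash[name] = Set.new }

# Write a fragment and register its key under every object it depends on.
def write_fragment(key, html, depends_on:)
  FRAGMENTS[key] = html
  depends_on.each { |owner| KEY_SETS[owner] << key } # SADD owner key
end

# Expire every fragment registered for the object, then drop its set.
def expire_for(owner)
  KEY_SETS[owner].each { |key| FRAGMENTS.delete(key) } # SMEMBERS, then DEL
  KEY_SETS.delete(owner)
end

write_fragment("views/recent_posts", "<ul>...</ul>", depends_on: ["post:1", "post:2"])
write_fragment("views/sidebar", "<div>...</div>", depends_on: ["post:1"])
write_fragment("views/footer", "<p>...</p>", depends_on: [])

expire_for("post:1")
remaining = FRAGMENTS.keys # only "views/footer" is left
```

One wrinkle in this sketch: the set for `"post:2"` still lists `views/recent_posts` after it has been expired, so stale members would need occasional cleanup.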
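In Rails this propagation is usually declared with `belongs_to :blog, touch: true`. The plain-Ruby sketch below only imitates that behaviour: `WRITES`, the `save` method, and the integer timestamps are illustrative assumptions, not ActiveRecord's implementation.

```ruby
WRITES = [] # records each simulated UPDATE statement

Blog = Struct.new(:id, :updated_at) do
  def cache_key
    "blogs/#{id}-#{updated_at}"
  end
end

Post = Struct.new(:id, :blog, :updated_at) do
  # Simulates saving a post whose association is declared with touch: true.
  def save(now)
    self.updated_at = now
    WRITES << "UPDATE posts"
    blog.updated_at = now # the "touch" on the parent
    WRITES << "UPDATE blogs"
  end
end

blog = Blog.new(1, 100)
post = Post.new(7, blog, 100)

key_before = blog.cache_key # "blogs/1-100"
post.save(200)
key_after = blog.cache_key  # "blogs/1-200"
```

So one logical save turns into two writes, and the second one exists only to change the parent's cache key.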
So - thoughts?