[Agavi-Users] Finally: Agavi has Caching!


David Zülke

Feb 7, 2007, 3:26:13 PM
to Agavi Users Mailing List, Agavi Dev Mailing List
Hi guys,

sorry I didn't get to write this mail any sooner...

Agavi finally has caching!


It works like this:
- you create a Foo.xml in module/Blah/cache/ to make FooAction cacheable
- you put your settings and rules in there
- you lean back and enjoy the speedup


Now let's examine the options in detail.

First, these caching configs have the usual <configurations> and
<configuration> stuff at the top level. You should maybe enable
caching for "production" environment only since the caches the system
writes get thrown away in debug mode anyway.

Now the top element is <cachings>, which is optional, like most plural
tags that do not require an attribute. In there we're getting to the
more exciting things: the <caching> element. Let's have a look:

<caching method="read" lifetime="2 hours" enabled="false">

The "method" attribute is optional. It may contain a space-separated
list of request method names the caching definition is valid for.
You could set up caching for "read" for a complex form, for example,
to speed things up a bit.

The "lifetime" attribute is optional, too. If omitted (omitted means
omitted, not lifetime="" !), the cache will be stored forever. You
can use any relative format allowed by the GNU date input formats:
http://www.gnu.org/software/tar/manual/html_node/tar_115.html#SEC115
http://php.net/manual/en/function.strtotime.php
Examples:
"1 day 6 hours"
"3 minutes 14 seconds"
Note that you can NOT use "thursday 02:00" and similar formats, because
on Thursdays after 02:00 this would give the same day, and not the
Thursday of next week. This is a strtotime limitation, but we might
add a fix for that in the future.

Last but not least the "enabled" attribute. Also optional, and
defaults to true.
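
To tie the attributes together, here is a rough sketch of what a minimal
Foo.xml might look like. The environment attribute on <configuration>
and the lack of any namespace declarations are assumptions made for
illustration; compare with one of the other config files in your
application before copying it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- module/Blah/cache/Foo.xml: makes FooAction cacheable -->
<configurations>
  <!-- assumption: the environment is selected with an "environment" attribute -->
  <configuration environment="production">
    <cachings>
      <!-- valid for "read" requests only; entries expire after two hours -->
      <caching method="read" lifetime="2 hours" enabled="true">
        <!-- groups, views, action attributes and output types go here -->
      </caching>
    </cachings>
  </configuration>
</configurations>
```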


Inside a <caching> block, you may define the following elements:
- <groups> (can be omitted) with <group> elements inside.
- <views> (can be omitted) with <view> elements containing names of
views to cache.
- <action_attributes> (can be omitted) with <action_attribute>
elements containing the names of action attributes to restore when
serving a page from cache; these will be available in the view's
initialize() method.
- <output_types> (can be omitted) with <output_type> elements that
define the layers, slots, request attributes and template variables
to cache.

Important: even when served from a cache, the action and view will
still be initialize()d! This is because a view's initialize() method
could change the container's output type. If you need an action
attribute for that, specify it using <action_attribute>.
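
As a sketch, the four kinds of children could be laid out like this.
The group value "products", the attribute name "product_name" and the
output type name "html" are placeholders picked for illustration, not
something the framework prescribes.

```xml
<caching lifetime="1 day 6 hours">
  <groups>
    <!-- a fixed string as group value -->
    <group>products</group>
  </groups>
  <views>
    <!-- only the Success view may be cached -->
    <view>Success</view>
  </views>
  <action_attributes>
    <!-- hypothetical attribute, restored for the view's initialize() on a cache hit -->
    <action_attribute>product_name</action_attribute>
  </action_attributes>
  <output_types>
    <!-- no layers listed, so everything is cached for this output type -->
    <output_type name="html" />
  </output_types>
</caching>
```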

WARNING: Be careful what you include in the cache. <action_attribute>
as well as <request_attribute> and <template_variable> (more on these
in a minute) can be used to include such items in a cache and restore
them when the cache is hit. You might need this in case of request
attributes to pass information to a global filter, for instance. But
always avoid caching objects, especially models, Propel rows etc.
The data is serialized, and in case of models, that would mean that
the entire context, and thus ALL objects in the framework are
included in the serialization. If you absolutely have to cache
objects, implement __sleep() and __wakeup() methods that remember the
context name and exclude the context itself from serialization.

Let's talk about the <view> elements first. If you don't specify any
<view>, all views will be allowed to be cached (<views></views>,
however, means NO views will be cached, so be careful!). You can either
give the name of an action's view as you would when returning it from
the action, or the full name of a view including its module:
<view>Success</view>
<view module="Admin">AddProductInput</view>

WARNING: You usually don't want to cache Error views, only Success.
Caching Error views would mean that attackers can quickly fill the
hard drive of your server by requesting random, invalid pages. You
know the drill.

But now to the most important element: the <group>. Groups work like
in Smarty, so I recommend you read
http://smarty.php.net/manual/en/caching.groups.php to understand the
fundamental concept.

Groups may also have a source. You are not limited to a fixed string
as a group value. You can use
- request parameters (very useful and often needed)
- request attributes (also from namespaces)
- constants
- the current locale
- user parameters
- user attributes (also from namespaces)
- user authenticated status
- user credential (or, rather, if the user has the credential or not)

I think it's best to give an example here. Let's say we want to cache
the page for viewing a product (Default module, ViewProductAction).
Each product has an ID, passed in via the request parameter "id". You
have i18n in your app, so we need separate caches for each locale.
And finally, authenticated users with the credential "reseller" see a
special price, so we need different caches for these users:
<groups>
<group source="request_parameter">id</group>
<group source="locale" />
<group source="user_credential">reseller</group>
</groups>

For product ID "3", locale "de" and non-resellers, this would put the
cache into:

app/
cache/
content/
3/
de/
0/
Default_ViewProduct/

In reality, these directory names are base64-encoded. The last folder
will contain a file "4-8-15-16-23-42.cefcache" containing the action
information, and one file for each cached output type (e.g.
"html.cefcache").

Now, it is difficult to clear such a cache (more on that later). You
might want to cache categories. These have IDs, too. You'd get
collisions. Not good. Besides, you cannot easily clear the cache for
all products. So we should add another group at the beginning of the
list:
<group>products</group>

Again, remember that the wrapping <groups> element is optional.
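
For reference, the complete group list for the product example would
then read roughly as follows (a sketch that only combines the snippets
shown so far):

```xml
<groups>
  <!-- fixed string first, so product caches cannot collide with category caches -->
  <group>products</group>
  <!-- one cache per product ID -->
  <group source="request_parameter">id</group>
  <!-- one cache per locale -->
  <group source="locale" />
  <!-- separate caches for users with and without the "reseller" credential -->
  <group source="user_credential">reseller</group>
</groups>
```

Following the directory layout shown earlier, product "3" in locale "de"
for a non-reseller should then presumably end up one level deeper, under
app/cache/content/products/3/de/0/Default_ViewProduct/ (names shown
before base64-encoding).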

That was easy, wasn't it?

But that doesn't cache anything yet. We need <output_type> elements,
too (yep, the <output_types> container is optional).

An <output_type> may have an optional "name" attribute to restrict
the rules inside to that output type (like the "method" attribute on
<caching>, it can have a space-separated list of output type names).

Inside, the following elements may occur:
- <layers> (can be omitted) with a list of <layer> children defining
which layers to cache
- <request_attributes> (can be omitted) with a list of
<request_attribute> elements containing names of request attributes
to store in the cache
- <template_variables> (can be omitted) with a list of
<template_variable> elements holding names of template variables that
are stored in the cache (and then available to all layers that aren't
cached and thus rendered even on a cache hit)
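
As a rough sketch, an <output_type> using all three optional containers
might look like this. The output type name "html" and the layer name
"content" are assumptions for the sake of the example.

```xml
<output_types>
  <!-- rules below apply to the "html" output type only -->
  <output_type name="html">
    <layers>
      <!-- cache this layer and everything inside it -->
      <layer name="content" />
    </layers>
    <request_attributes>
      <!-- stored with the cache and restored on a cache hit -->
      <request_attribute namespace="org.agavi.filter.FormPopulationFilter">populate</request_attribute>
    </request_attributes>
    <template_variables>
      <!-- made available to layers that are still rendered on a cache hit -->
      <template_variable>_title</template_variable>
    </template_variables>
  </output_type>
</output_types>
```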

I'll explain <request_attribute> first:

<request_attribute namespace="org.agavi.filter.FormPopulationFilter">populate</request_attribute>

This will store the attribute "populate" from the namespace
"org.agavi.filter.FormPopulationFilter" and restore it when the cache
is read. In our example, this attribute would contain a
ParameterHolder object with fields to populate. Obviously, you should
only use that for the "read" request method to fill initial values
into your form. A bit of a stupid example, since you'll often have
default values (like "em...@example.com" or a checkbox pre-selected)
in the template itself, but who knows. You get the idea.

Next, layers. For this example, we assume that our current view has
two layers - "content" and "decorator".

To cache everything, you simply don't define any layers.

To cache only the "content" layer, you'd do
<layer name="content" />
Then this layer and all layers inside will be cached. The "decorator"
layer would still be rendered on each request.

That gets you into trouble - you have the "_title" template variable
you want to output in the decorator, in the HTML <title> element. But
since the view is not executed anymore when a cache hit occurs, we
have to tell the caching engine to store this template variable along
with the content and then restore it before the decorator is rendered.
Easy task:
<template_variable>_title</template_variable>

Now back to the layers. Instead of
<layer name="content" />
you could also have done
<layer name="decorator" include="false" />

That would exclude the layer from the cache, and include all layers
inside. Surprise surprise, that is actually the better way of doing
it! Simple reason: your view _might_ insert another intermediate
layer (let's call it "wrapper") between "content" and "decorator".
That layer should be cached, too (at least we assume that here), but
setting the "content" layer as the cacheable layer wouldn't work for
that - only that layer would be cached. If we define "decorator" to
be the last layer before the cache kicks in, though, we'll achieve
that goal and have any layer inside "decorator" in the cache.
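
Combined with the "_title" template variable from before, that
recommended setup could be sketched like this (the output type name
"html" is an assumption; the <layers> and <template_variables>
containers are left out since they are optional):

```xml
<output_type name="html">
  <!-- decorator is rendered on every request; every layer inside it is cached -->
  <layer name="decorator" include="false" />
  <!-- restored on a cache hit so the decorator can still print the title -->
  <template_variable>_title</template_variable>
</output_type>
```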

Now let's assume we have that setup, but we do NOT want the "wrapper"
layer in the cache. Then we could do:
<layer name="decorator" include="false" />
<layer name="wrapper" include="false" />

But now you might ask "hey isn't that stupid, why do I have to
declare them both as non-cacheable?", and you're right to insist:
<layer name="wrapper" include="false" />
does exactly the same thing.

BUT

What about slots?

Here's the nice thing. Obviously, slots set on a layer are included
in the layer's cache, since the whole layer output is cached, so the
slot output will already be included. But if you have include="false"
on a layer, then the slots in it will NOT be included in the cache,
since the layer is rendered each time.

But of course, we can include slots in a cache, even if their layer
is not cacheable:
<layer name="decorator">
<slot>menu</slot>
</layer>

This will include the "menu" slot in the cache.

Note: I omitted the "include" attribute, since declaring slots inside
a <layer> automatically sets include to false (unless you provide it
and set it to "true", of course).

Note2: I didn't specify <slots>, but you can, if you like.

Note3: The order of layers in the configuration does not matter,
unlike in the layout configuration in output_types.xml.

And now the above example with duplicate layers makes sense again:
<layer name="decorator">
<slot>menu</slot>
</layer>
<layer name="wrapper" include="false" />

Of course, every layer can specify slots it wants to cache. Also, if
you do not want to cache a slot inside the calling cache, you can of
course set up a caching xml config for the slot action itself.
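
Pulling the pieces from this mail together, a config for the
ViewProduct example might end up looking something like the sketch
below. It assumes the view has the three layers discussed above
("content", "wrapper" and "decorator"), a "menu" slot on the decorator
and an output type named "html". The file name is a guess based on the
module/Blah/cache/Foo.xml pattern, and the environment attribute is an
assumption as well.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- assumed location: module/Default/cache/ViewProduct.xml -->
<configurations>
  <configuration environment="production">
    <caching method="read" lifetime="2 hours">
      <group>products</group>
      <group source="request_parameter">id</group>
      <group source="locale" />
      <group source="user_credential">reseller</group>

      <!-- never cache Error views -->
      <view>Success</view>

      <output_type name="html">
        <!-- decorator itself is not cached, but its "menu" slot is -->
        <layer name="decorator">
          <slot>menu</slot>
        </layer>
        <!-- wrapper is rendered each time; the content layer inside it is cached -->
        <layer name="wrapper" include="false" />
        <!-- needed by the decorator, which still renders on a cache hit -->
        <template_variable>_title</template_variable>
      </output_type>
    </caching>
  </configuration>
</configurations>
```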

Tip: Agavi stores cookies with a lifetime, not with an expiry time.
Thus, you can cache actions/views that set a cookie on their
response, it's not a problem and works just fine! The same goes for
all other response headers you set. They are all included in the
cache and restored afterwards. You can even set a stream as the
response content, e.g.
$this->getResponse()->setContent(fopen('/path/to/image.png', 'rb'));
That file will be re-opened when the cache is read and, like all
streams set in the response, output using fpassthru() for maximum
performance.

I hope that helps a bit. I'm sure there will be many questions from
now, and of course, caching will be covered extensively in the
documentation, on which we will focus from now on. Just shoot a mail
to the users list if you find something confusing, or drop by on the IRC
channel.

Cheers,


David


P.S: yup, RC2 is really coming tonight! It's only a matter of hours now.

_______________________________________________
users mailing list
us...@lists.agavi.org
http://lists.agavi.org/mailman/listinfo/users
