[2.0] [Scala] Using sbt code generators to generate hashes of assets to optimize caching

Drew Kutcharian

Apr 10, 2012, 2:48:02 AM
to play-framework
Hi All,

I've been thinking about using sbt to generate, at build time, a class containing a map of the MD5 hashes of each asset file's contents (CSS or JS), keyed by the path+name of the asset.

I think this should be effective since the contents don't change after a build. This way I can append that hash as a url parameter to the end of the asset url and/or use it as an etag to optimize proxy and browser caching.
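A minimal sketch of the idea, assuming the assets live in a directory on disk at build time. The names `AssetHashes`, `hashAll` and `versioned` are hypothetical and not part of Play or sbt; only JDK classes are used.

```scala
import java.io.File
import java.nio.file.Files
import java.security.MessageDigest

object AssetHashes {
  /** Hex-encoded MD5 digest of a byte array. */
  def md5Hex(bytes: Array[Byte]): String =
    MessageDigest.getInstance("MD5").digest(bytes).map(b => "%02x".format(b & 0xff)).mkString

  /** Recursively lists the regular files under `dir`. */
  private def listFiles(dir: File): List[File] =
    Option(dir.listFiles).map(_.toList).getOrElse(Nil).flatMap { f =>
      if (f.isDirectory) listFiles(f) else List(f)
    }

  /** Immutable map from asset path (relative to `root`, '/'-separated) to content hash. */
  def hashAll(root: File): Map[String, String] =
    listFiles(root).map { f =>
      val relative = root.toURI.relativize(f.toURI).getPath
      relative -> md5Hex(Files.readAllBytes(f.toPath))
    }.toMap

  /** Appends the hash as a query parameter; unknown assets keep their plain URL. */
  def versioned(url: String, path: String, hashes: Map[String, String]): String =
    hashes.get(path).fold(url)(h => url + "?" + h)
}
```

Because the hash depends only on the file contents, redeploying an unchanged file yields the same URL, which is the property the last-modified timestamp lacks.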

I think this can be better than the Last Modified Time, since effectively every time you deploy a Play app you get a new timestamp, which forces users to download the file again even though it might not have changed.

What do you guys think? Is it worth using this approach in Play's Asset controller?

Thanks,

Drew

Drew Kutcharian

Apr 10, 2012, 11:57:53 AM
to play-fr...@googlegroups.com
No response?

> --
> You received this message because you are subscribed to the Google Groups "play-framework" group.
> To post to this group, send email to play-fr...@googlegroups.com.
> To unsubscribe from this group, send email to play-framewor...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/play-framework?hl=en.
>

Julien Tournay

Apr 10, 2012, 3:53:51 PM
to play-fr...@googlegroups.com
Hi,

Play Assets controller is already sending etags, so I don't know if this would actually change anything.

jto.

Eric Jain

Apr 10, 2012, 4:30:38 PM
to play-framework
On Apr 9, 11:48 pm, Drew Kutcharian <d...@venarc.com> wrote:
> What do you guys think? Is it worth using this approach in Play's Asset controller?

+1

I just happen to need the same thing, created a ticket here:
https://play.lighthouseapp.com/projects/82401-play-20/tickets/340-option-to-append-checksum-to-asset-links

Eric Jain

Apr 10, 2012, 4:41:20 PM
to play-framework
On Apr 10, 12:53 pm, Julien Tournay <boudhe...@gmail.com> wrote:
> Play Assets controller is already sending etags, so I don't know if this
> would actually change anything.

I believe Play's etags are based on the last modification date, and
could therefore change with each redeployment?

Even if the etags were checksum-based etags, clients would still need
to send a request for each resource, to check if the resource has been
changed. With revved filenames, you can set a large max-age to avoid
these unnecessary requests.
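To make the difference concrete, here is a rough sketch of the two caching policies. It assumes revved names look like `name.<32 hex chars>.ext`; the `CachePolicy` object and that naming scheme are illustrative assumptions, not anything Play does today.

```scala
object CachePolicy {
  // Assumed naming scheme: name.<32 hex chars>.ext, e.g. main.5d41...c592.css
  private val Revved = """.*\.[0-9a-f]{32}\.[A-Za-z0-9]+""".r

  /** True when the path carries a content hash before its extension. */
  def isRevved(path: String): Boolean = Revved.pattern.matcher(path).matches()

  /** Response headers for an asset path. */
  def headers(path: String, etag: String): Map[String, String] =
    if (isRevved(path))
      // Content can never change under this name, so clients may skip revalidation for a year.
      Map("Cache-Control" -> "public, max-age=31536000")
    else
      // Plain names must be revalidated; the ETag lets the server answer 304 Not Modified.
      Map("Cache-Control" -> "no-cache", "ETag" -> ("\"" + etag + "\""))
}
```

With the first policy a returning client sends no request at all; with the second it still pays one round trip per resource, even when the answer is 304.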

Not a big deal for small intranet applications, but something you
might expect to find in a framework that promises a "web-friendly
architecture [...] for highly-scalable applications"...

Drew Kutcharian

Apr 10, 2012, 4:41:51 PM
to play-fr...@googlegroups.com
OK, the main issue is that pretty much none of the reverse proxies out there support caching/invalidating caches based on etags out of the box AFAIK (there are some 3rd party modules for Nginx, but they are incomplete). So if you want your front-end web server to cache resources so it won't hit the Assets controller every time, it can't be done using etags. That leaves us falling back on good old query parameters like:


As you can see, this becomes a real pain to handle unless it is done automatically at build time. So ideally the Assets controller would know the hashes of the assets at build time (since assets don't change once distributed), and the reverse routes would append that build-time hash when generating the urls.

In addition, the Assets controller currently uses a mutable ConcurrentHashMap to keep track of all the etags and last modified times; these could be turned into immutable maps if the values were calculated at build time.
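As a sketch of what the generated code could look like, the function below renders the source of an object holding an immutable map. The names `AssetHashSource` and `GeneratedAssetHashes` are made up for illustration, and hooking the function into sbt's source generators is deliberately left out, since the exact setting syntax depends on the sbt version.

```scala
object AssetHashSource {
  /** Escapes backslashes and double quotes so a value is safe inside a Scala string literal. */
  private def literal(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  /** Renders the source of an object holding an immutable path -> hash map. */
  def render(pkg: String, hashes: Map[String, String]): String = {
    // Sort the entries so the generated file is stable from build to build.
    val entries = hashes.toSeq.sortBy(_._1)
      .map { case (path, hash) => "    " + literal(path) + " -> " + literal(hash) }
      .mkString(",\n")
    "package " + pkg + "\n\n" +
      "object GeneratedAssetHashes {\n" +
      "  val hashes: Map[String, String] = Map(\n" +
      entries + "\n" +
      "  )\n" +
      "}\n"
  }
}
```

The generated object is a plain `val`, so at runtime there is nothing to synchronize and nothing to compute.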

-- Drew


Julien Tournay

Apr 10, 2012, 5:08:23 PM
to play-fr...@googlegroups.com
For high traffic applications you should use CDNs for assets anyway.

Drew Kutcharian

Apr 10, 2012, 5:17:47 PM
to play-fr...@googlegroups.com
Yes, we are using CDNs and that's how the issue came up ;)


Julien Tournay

Apr 10, 2012, 5:21:20 PM
to play-fr...@googlegroups.com
Hmm, that's an interesting point. I didn't know that support for etags was so poor.

As for the implementation details, it sounds more like something done at runtime to me.
The compiler could rename files, but then you'd have to use that hash for reverse routing.

I think you need to hack the reverse router compiler to add the extra param, and the Assets controller. (Maybe just the reverse router; Assets could just ignore the extra param.) Basically, hashes would be computed onApplicationStart and used by the reverse router.

jto.

Drew Kutcharian

Apr 10, 2012, 5:26:17 PM
to play-fr...@googlegroups.com
No, there's no need to rename the files; that's why I suggested appending the hash as a query parameter to the end of the url. The file name stays the same.

-- Drew


Eric Jain

Apr 10, 2012, 5:27:15 PM
to play-fr...@googlegroups.com
On Tue, Apr 10, 2012 at 14:08, Julien Tournay <boud...@gmail.com> wrote:
> For high traffic applications you should use CDNs for assets anyway.

Even if I can force the CDN to cache the assets without revalidating,
clients still end up sending unnecessary requests to the CDN.

Drew Kutcharian

Apr 10, 2012, 5:29:46 PM
to play-fr...@googlegroups.com
@Eric,

I added the link to this conversation to the lighthouse ticket. BTW, the lighthouse ticket looks incomplete to me, maybe something happened during posting?

https://play.lighthouseapp.com/projects/82401-play-20/tickets/340-option-to-append-checksum-to-asset-links

-- Drew

Julien Tournay

Apr 10, 2012, 8:52:34 PM
to play-fr...@googlegroups.com
My point is that it's not something you will solve with sbt. It's runtime.

jto

Drew Kutcharian

Apr 10, 2012, 9:02:29 PM
to play-fr...@googlegroups.com
Why not?

Julien Tournay

Apr 10, 2012, 9:08:25 PM
to play-fr...@googlegroups.com
Sounds like a complicated to a simple problem no ?

Julien Tournay

Apr 10, 2012, 9:09:14 PM
to play-fr...@googlegroups.com
+solution

Eric Jain

Apr 10, 2012, 9:09:15 PM
to play-fr...@googlegroups.com
On Tue, Apr 10, 2012 at 18:02, Drew Kutcharian <dr...@venarc.com> wrote:
> Why not?

Doing it at runtime is simpler? (add some code or subclass the Asset
controller vs figuring out how to deal with sbt)

Julien Tournay

Apr 10, 2012, 9:15:55 PM
to play-fr...@googlegroups.com
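You don't have to change the Asset controller, I think.
If the reverse router just added a query param to the path, the Asset controller would still work.

Basically www.myapp.com/asset/images/foo.png?v1 instead of www.myapp.com/asset/images/foo.png

Would that work with reverse proxies?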

jto

Eric Jain

Apr 10, 2012, 9:35:20 PM
to play-fr...@googlegroups.com
On Tue, Apr 10, 2012 at 18:15, Julien Tournay <boud...@gmail.com> wrote:
> You don't have to change the Asset I think.
> If the reverse router was just adding a query param to the path, the Asset
> would still work.

Yes; I was talking about the Asset controller because I thought
routes.Assets is generated from controllers.Assets?


> basically www.myapp.com/asset/images/foo.png?v1 instead
> of www.myapp.com/asset/images/foo.png
>
> Would that work with reverse proxies ?

Yes. But some proxies won't cache resources with a '?' [1], so it
might be a good idea to use foo.v1.png instead of foo.png?v1.

[1] https://developers.google.com/speed/docs/best-practices/caching#LeverageProxyCaching
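For what it's worth, the renaming itself is tiny. A sketch (the `Rev` object is a hypothetical helper, and `hash` can be any version string):

```scala
object Rev {
  /** Inserts `hash` before the extension of the last path segment: foo.png -> foo.<hash>.png. */
  def revved(path: String, hash: String): String = {
    val slash = path.lastIndexOf('/')
    val dot = path.lastIndexOf('.')
    if (dot > slash + 1) path.substring(0, dot) + "." + hash + path.substring(dot)
    else path + "." + hash // no extension (or a dotfile): append the hash instead
  }
}
```

Keeping the extension last means proxies and servers that pick a content type from the file name keep working unchanged.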

Drew Kutcharian

Apr 10, 2012, 9:40:22 PM
to play-fr...@googlegroups.com
Doing it at runtime is definitely easier, but at the cost of having mutable state in a possibly hot area and the risk of race conditions. (I'm not even sure if it can be done; how would you know the hash of a resource without first computing it?)

As for proxies not caching resources with a '?', the URL can be changed to www.myapp.com/asset/images/foo.png/<md5(foo.png)>

-- Drew

Eric Jain

Apr 11, 2012, 1:21:27 AM
to play-fr...@googlegroups.com
On Tue, Apr 10, 2012 at 18:40, Drew Kutcharian <dr...@venarc.com> wrote:
> Doing it at runtime is definitely easier, but at the cost of having mutable state in a possibly hot area and the risk of race conditions. (I'm not even sure if it can be done; how would you know the hash of a resource without first computing it?)

Worst case the hash is calculated several times?
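A rough sketch of why the race is benign, assuming hashes are computed lazily and cached. `HashCache` and its `compute` parameter are hypothetical; `ConcurrentHashMap` is the plain JDK class.

```scala
import java.util.concurrent.ConcurrentHashMap

class HashCache(compute: String => String) {
  private val cache = new ConcurrentHashMap[String, String]()

  /** Returns the cached hash for `path`, computing it on first use. */
  def hashOf(path: String): String = {
    val cached = cache.get(path)
    if (cached != null) cached
    else {
      // Two threads may both reach this point and both call `compute`; that is wasted
      // work, not a correctness bug, because the same content always yields the same hash.
      val fresh = compute(path)
      val previous = cache.putIfAbsent(path, fresh)
      if (previous != null) previous else fresh
    }
  }
}
```

This only holds as long as the files do not change while the application is running, which is the case for packaged assets.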

> As for proxies not caching resources with a '?', the URL can be changed to www.myapp.com/asset/images/foo.png/<md5(foo.png)>

That should work, though I'd rather keep the file extension at the end.

Drew Kutcharian

Apr 11, 2012, 2:50:17 AM
to play-fr...@googlegroups.com
I'm starting to think I'll be better off not using Play's public folder at all for static assets in production and just use a custom "hash aware" Assets controller for only the resources that are managed by Play such as LESS and Coffee/Javascript files. I'll just have to be careful when updating the static assets so their last modified times stay intact if they haven't changed and hope that Nginx will handle the rest. Thanks for all the great input guys.

cheers,

Drew

Julien Tournay

Apr 11, 2012, 3:06:05 AM
to play-fr...@googlegroups.com
No, as I said you can do that on application start in your Global object. It definitely does not have to be mutable.

jto.


Daithi O Crualaoich

May 2, 2012, 11:06:07 AM
to play-fr...@googlegroups.com
Drew,

We are in a similar position and also not interested in taking a roundtrip to our CDN for an ETag freshness check. A complicating factor is that the site we are developing is to be comprised of multiple Play apps, potentially at different releases and which may or may not share the same asset versions. 

We concluded that we also want foo.<hash>.png style assets. We have our reasons:
 * We want to use a separate URL location outside of Play, our CDN, for hosting the statics.
 * This location is to be used by many apps.
 * Multiple revisions of assets must exist to facilitate rolling deployment and multiple applications.
 * We want the cache hit rate gain from our multiple component apps sharing the assets.
 * Because of CDN hosting, we want to rename the files and have them available at deployment, runtime is too late.

We have been down this route a bit already (still a work in progress) and making it work is a small chore. Briefly, we redefine `managedResources` to calculate file hashes and do the file copies. We also generate a properties file at this point which contains the renamings. This properties file is helpful in building deployment tasks to upload the assets to our CDN anyway, but its primary purpose is to facilitate asset filename lookup in templates, e.g.:

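    object Static {
      // Base URL of the static host (our CDN location).
      val base: String = ...
      // Loaded once, on first use: original path -> renamed path.
      private lazy val staticMappings: Map[String, String] = { ... load foo.png -> foo.<hash>.png mappings here ... }

      // Full URL of a renamed asset; throws NoSuchElementException if the path has no mapping.
      def apply(path: String): String = base + staticMappings(path)
    }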

We use this `Static` object in templates instead of `routes.Assets.at(<filename>)` lookup:

    <link rel="stylesheet" type="text/css" href="@Static("stylesheets/main.css")">
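One way the properties round trip could look, as a sketch only. `StaticMappings` is a hypothetical name, and the text is kept in memory here; a real build would write it to a file and the application would read it from the classpath.

```scala
import java.io.{StringReader, StringWriter}
import java.util.Properties

object StaticMappings {
  /** Serialises path -> renamed-path mappings in java.util.Properties format. */
  def write(mappings: Map[String, String]): String = {
    val props = new Properties()
    mappings.foreach { case (k, v) => props.setProperty(k, v) }
    val out = new StringWriter()
    props.store(out, "asset remappings")
    out.toString
  }

  /** Parses the properties text back into an immutable Map. */
  def read(text: String): Map[String, String] = {
    val props = new Properties()
    props.load(new StringReader(text))
    val names = props.stringPropertyNames().iterator()
    var result = Map.empty[String, String]
    while (names.hasNext) {
      val key = names.next()
      result += (key -> props.getProperty(key))
    }
    result
  }
}
```

The result of `read` is what a `lazy val` like `staticMappings` would hold, so the lookup in templates stays a plain immutable map access.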

We make it additionally difficult for ourselves by including assets from library dependencies that have also been generated in this scheme. This facilitates factoring common assets for use by our multiple frontend applications but it complicates implementation in a number of ways, primarily around one-jarring.

Gotchas included:

 * We consider the hashing to be appropriate for a managed resource only. If the resource is unmanaged, i.e. in `/public`, it is your own problem. Consequently we add plain copy resource generators for images and CSS files in `app.assets` since they are not covered by existing Play resource generators.

 * Onejar clobbering of the property files containing the filename remappings. We include hash segments on these property files and search for them at runtime. We also error on remapping files from common dependencies so as to not cause surprise. 

 * It is not nice redefining `managedResources` for this but it is convenient. We want to remap after the closure compiler, less compiler, etc have run and before the build moves much further.

 * Referencing assets from other assets, i.e. from untemplated javascript files where the `Static` object is not available. Our present thinking on this is to add the necessary remaps to a javascript function in a template proper and use this function to generate asset locations in untemplated assets.

 * An object `Static` doesn't reload nicely in SBT when the common asset library dependency is changed.

 * We previously attempted some madness doing this in play-copy-assets. That was not the place for it.
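On the gotcha about referencing assets from untemplated JavaScript: a sketch of rendering the remaps into a JavaScript function from Scala. The names `JsRemap` and `staticUrl` are hypothetical, and the sketch escapes only quotes and backslashes, so it assumes asset paths contain nothing else that needs escaping inside a script block.

```scala
object JsRemap {
  /** Quotes a value as a JavaScript string literal (quotes and backslashes only). */
  private def js(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  /** Renders a JavaScript function `staticUrl(path)` backed by the given remappings. */
  def render(base: String, mappings: Map[String, String]): String = {
    val table = mappings.toSeq.sortBy(_._1)
      .map { case (k, v) => js(k) + ": " + js(v) }
      .mkString("{", ", ", "}")
    "function staticUrl(path) { var m = " + table + "; return " + js(base) + " + m[path]; }"
  }
}
```

A template proper would emit the rendered function once, and untemplated scripts would then call `staticUrl("images/foo.png")` instead of hard-coding the location.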

Stable link to code containing our present implementation:

Follow along at:



Daithi
