Silverstripe and Varnish / Static caching with Redis

Klemen Novak

Apr 4, 2014, 2:25:14 PM
to silverst...@googlegroups.com
Just curious if any of you are using specific Varnish VCL scripts or SS configuration to optimize the Varnish-SS connection. I have a feeling SS sets cookies and TTLs in a way that may not be ideal, but I haven't looked into this much.

Also, has anyone considered doing static output caching into Redis? If I wanted a simple module to do that just for a few specific controllers (e.g. pages and DataObject view controller methods), where is the output captured, and in which class?

Thirdly, I know that figuring out caching still doesn't solve the limitations on the Apache side - the number of concurrent connections and memory spikes. What do you guys normally do to alleviate that? Is Varnish the ultimate solution?

Thanks for the input! Looking forward to your answers.

Klemen Novak

--
Klemen Novak
KINK Creative - Graphic and Web Design
www.kinkcreative.com
310.849.8931

Mateusz Uzdowski

Apr 6, 2014, 6:38:04 PM
to silverst...@googlegroups.com, kle...@kinkcreative.com
Hi Klemen,

I don't think there is an ultimate solution for everyone, but one piece of advice is: keep it as simple as possible. A lot of it depends on how much traffic you have, and whether you can actually cache your content (and how aggressively).

Regarding Varnish (and other transparent caches), there is some rudimentary support for a cache in front of a SilverStripe install via HTTP::set_cache_age(), which might help alleviate your load. By default all responses come with cache-murdering headers, but if you set this to >0 it will add appropriate Cache-Control and Vary headers that Varnish should pick up. Otherwise Varnish will only cache static files (assets).
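
For reference, enabling it is a one-liner in your project config - a minimal sketch (the 3600s value and the mysite/_config.php location are just examples, adjust to your setup):

<?php
// mysite/_config.php
// 0 is the default and means "don't allow downstream caching"; anything greater
// lets SilverStripe emit Cache-Control/Vary headers that Varnish can act on.
HTTP::set_cache_age(3600);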

One thing to be aware of is that you'll need to rewrite the User-Agent header in Varnish to sort all the crazy User-Agent strings into a couple of distinct classes, or remove it. Otherwise Varnish will try to generate a cached version for every User-Agent in the world. If you are running a responsive site and don't do agent-sniffing in your PHP code you could also remove Vary: User-Agent completely (but there is no switch for that in core... ugh).

If you are intending to load-balance using varnish, you will need some kind of sticky sessions to still allow logins (otherwise it will round-robin between backends and you will have to log in on every request :-D ). I have something that does it, let me know and I can extract it for you.

For high-traffic sites you can use https://github.com/silverstripe-labs/silverstripe-staticpublishqueue/ to generate an HTML/PHP dump of the site, and then serve static HTML files. If a file is not available, it can fall back to a dynamic PHP request (or you could block it). This is at the moment the only solution I know of that handles cache invalidation: when you publish, the HTML file gets updated. I'm not sure if there is anything that would make SS ping a cache (Redis? Varnish?) once new content is available. I'd be keen to know if there is anything :-)

So here are the things I'd probably try, in order of amount of work necessary:

- partial caching (<% cached %>) and other optimisations (i.e. find the slowest URLs and fix them in code)
- if you have Varnish in front, set HTTP::set_cache_age() to non-zero (remember the User-Agent Vary header - perhaps just pop it off in your VCL config!)
- add more resources to your server: e.g. if you are getting 1 interactive req per second regularly, maybe it makes sense to add a second core and more RAM first.
- try the staticpublishqueue module (this is easier than it seems, although you have to think up front about the relationships between pages - i.e. what node updates what, and what is publishable). This helps massively, unless you have a lot of interactive requests that are un-cacheable (e.g. form submissions)
- try Varnish load balancing and add more machines - if your load is on Apache, not in the DB - as long as you keep one DB it's not going to be very hard (apart from the sticky session code, where I can help you out).
- write your own caching system - if you plan aggressive caching (e.g. for hours) this would probably require writing some code to invalidate the cache when the content changes, and also some way of defining dependencies (staticpublishqueue already has that: e.g. if a page changes, you might also want to update some menus on other nodes)

And then at the faaar end:
- horizontal scaling that includes the DB - pretty much any kind of partitioning or clustering where you have multiple DBs. It would probably help to get familiar with the CAP theorem. This blogger tested a lot of distributed systems and found that with default settings you start losing quite a bit of data - maybe worth skimming too: http://aphyr.com/tags/jepsen .

m

Klemen Novak

Apr 7, 2014, 2:43:48 PM
to Mateusz Uzdowski, silverst...@googlegroups.com
True, Mateusz! I've noticed two things happen - Apache gets clogged as we hit 100+ users (on a 2 GB Linode VPS), there are memory and CPU spikes, and then the site becomes unavailable.

Static publishing looks really great, although I'm curious if it can actually take custom controllers into account. E.g. there are numerous controllers which use a standard /myaction/URLSegment approach, together with an index page that lists paginated DataObjects. Those need to be delivered really fast. Partial caching did speed up load times a bit, but under heavy load the server would still stall.

Both sites are 2.4.x, so I am not sure how SS3 behaves - haven't had a high-impact load on any of my SS3 sites yet so I can't say.

I have Varnish set up, although my pages still show as "Yes! Sort of!" on the Varnish check (http://www.isvarnishworking.com). It's funny because the max-age is now set to 30, and the top section (not the header itself, but the "analysis") still says it's 0. I am not really sure where to remove the Vary: User-Agent header - I don't really need it, the theme is responsive. I looked around but haven't been able to find a solution. Would you mind sharing a link?

What if we actually cached into Redis inside the controller, or JUST the SSViewer output, based on URL parameters (e.g. actions/IDs/sub-IDs)? Feeds I normally just cache into APC from controllers and simply echo, but obviously they do not require theme parsing, so I'm not sure where I'd start looking into -that-.

Redis would probably be pretty cool for that due to its structure: quick look-ups based on an ID / hash (= action/URLSegment, for example), and it would just spit out the stored HTML. Keys in Redis can have an expiration set from within Redis, so if the entry isn't there, it would re-parse and store it.
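
Something along these lines, as a purely hypothetical sketch (the phpredis calls are real, but the controller name, key scheme and render() usage are SS3-flavoured assumptions that would need adapting):

<?php
// Hypothetical sketch: cache rendered controller output in Redis, keyed by URL.
// Assumes the phpredis extension; host/port and the 5-minute TTL are placeholders.
class CachedPage_Controller extends Page_Controller {
    public function index() {
        $redis = new Redis();
        $redis->connect('127.0.0.1', 6379);

        $key = 'html:' . $this->getRequest()->getURL(true); // include query string
        $html = $redis->get($key);

        if ($html === false) {
            // Not cached (or expired) - render through the theme as usual, then store.
            $html = (string) $this->render();
            $redis->setex($key, 300, $html);
        }
        return $html;
    }
}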

Thanks for the great insight,

Klemen 

Klemen Novak
KINK Creative - Graphic and Web Design
www.kinkcreative.com
310.849.8931


Mateusz Uzdowski

Apr 8, 2014, 10:36:56 PM
to silverst...@googlegroups.com, Mateusz Uzdowski, kle...@kinkcreative.com
From hearsay evidence, 3.x is slower than 2.4 because of the config system, so you will need to take that into account if you want to upgrade.

100+ users in parallel hitting interactive URLs? That sounds... excellent :-) Unless it's per minute? Not sure how staticpublishqueue would work with 2.4; you might need to use an older branch, or the older module, which was called "staticpublisher".

The publishing is done via URLs; the underlying structure doesn't matter. The actual creation of the static cache file happens through Director::test("your-url/2/whatever/iso.bin"), so as long as it's a valid controller URL, it will come out all right.
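
To illustrate the mechanism (a rough sketch of the idea, not the module's actual code - the cache path here is made up):

// Render a URL through the normal controller stack, without a real HTTP request,
// then dump the response body to a static file the webserver can serve directly.
$response = Director::test('mypage/2');
$file = BASE_PATH . '/cache/mypage/2.html';
Filesystem::makeFolder(dirname($file));
file_put_contents($file, $response->getBody());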

Specifically, with staticpublishqueue you can customise the URLs you want to publish per publishable object - there is no limit on quantity. You just need to provide your own urlsToCache method on your "StaticallyPublishable" object, whatever that object is (either through inheritance or interface implementation) - here is an example of how the default is set up: https://github.com/silverstripe-labs/silverstripe-staticpublishqueue/blob/master/code/extensions/publishable/PublishableSiteTree.php#L81 , but you can write your own too.

There is a similar approach in the old staticpublisher for that (probably also called urlsToCache, this one is based on inheritance only). But it's certainly possible.

For example, if you had a paginated page you'd provide a whole set of URLs - mypage, mypage/1, mypage/2, mypage/3 (in a loop) - and it would produce cache files for you: cache/mypage.html, cache/mypage/1.html, cache/mypage/2.html and so on. Then your .htaccess or nginx config would rewrite requests appropriately (and fall back to a dynamic request if a file is not found, if you wish - this is all set up on the webserver side).
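
As a sketch of that pagination case (hypothetical code - urlsToCache is the method named above, but the url => priority return format and the page-size maths are my assumptions, so check the linked PublishableSiteTree for the exact contract):

<?php
// Hypothetical: publish mypage, mypage/1, mypage/2, ... for a paginated listing,
// either via inheritance or the StaticallyPublishable interface described above.
class MyListingPage extends Page {
    public function urlsToCache() {
        $urls = array($this->Link() => 0);
        $perPage = 10;                          // assumed items per page
        $total = MyDataObject::get()->count();  // SS3-style count of listed objects
        $pages = (int) ceil($total / $perPage);
        for ($i = 1; $i < $pages; $i++) {
            $urls[$this->Link((string) $i)] = 0;
        }
        return $urls;
    }
}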

Re the User-Agent header, try this: https://www.varnish-cache.org/trac/wiki/VCLExampleFixupVary . The Varnish checker might be picking up on that header...

m

Klemen Novak

Apr 8, 2014, 10:56:59 PM
to Mateusz Uzdowski, silverst...@googlegroups.com, kle...@kinkcreative.com
That sounds cool - especially because Varnish would do great with static files. I'm wondering if the static publisher could write to Redis instead, so I could play with it and see if it performs any faster. I'm by no means a developer of that level, but I might give it a try.

And thank you for the link!

Sent from my iPhone

Klemen Novak

Apr 8, 2014, 11:07:14 PM
to Mateusz Uzdowski, silverst...@googlegroups.com, kle...@kinkcreative.com
Oh, and for the visitors - it's at the same time... at one point we had 800 concurrent connections on the server. That one required an upgrade to a 16 GB Linode and shutting KeepAlive off (it's tricky with these timed releases of content, when EVERYONE hits the server at a single time), but that was before I was able to install Varnish. Since Varnish it's been pretty smooth, although the node is a beast and I wonder, after the spikes are over, how far I could downgrade without jeopardizing performance (2 GB? 4 GB?)

Klemen Novak
KINK Creative - Graphic and Web Design
www.kinkcreative.com
310.849.8931


vikas srivastava

Apr 9, 2014, 12:55:31 AM
to silverst...@googlegroups.com, Mateusz Uzdowski, kle...@kinkcreative.com
Hello Mateusz!
> If you are intending to load-balance using varnish, you will need some kind of sticky sessions to still allow logins (otherwise it will round-robin between backends and you will have to log in on every request :-D ). I have something that does it, let me know and I can extract it for you.

I am facing this same problem while trying to set up Varnish with load balancing. Can you please share how you resolved it? Also, what would be your choice between Varnish and Static Publish Queue? I see there are many challenges with Varnish when using it with SilverStripe, like maintaining user sessions, forms with security tokens, etc. It would be really nice if you could guide us through the solutions you chose to overcome these issues :)

thanks and regards

Vikas Srivastava
Software Engineer | Moonpeak Media

Skype: viky2130
twitter: @openbeesV


Mateusz Uzdowski

Apr 9, 2014, 5:43:01 PM
to silverst...@googlegroups.com, Mateusz Uzdowski, kle...@kinkcreative.com
@Klemen ah OK, I see these Linode servers have 8 cores. To satisfy my own curiosity, do you know what your page load times are when there is only 1 user hitting your site, and what they are when there are 800? From a rough calculation, even if your dynamic site responds within say 200ms, 8 workers can only serve around 40 requests per second (8 × 5 req/s) - that'd still mean a 20-second wait for a page during the spike (800 ÷ 40)?

I don't know how http://www.isvarnishworking.com is supposed to work, but if you log in to your Varnish box and run `varnishhist`, you can see clearly how many requests are cached and how many are not - a pipe | is a cache hit, a hash # is a cache miss. Also, for debugging, varnishtop can do quite elaborate filtering of data (e.g. `varnishtop -i TxUrl` to get the most frequently fetched URLs), and varnishncsa gives you Apache-access.log-like output from memory (https://www.varnish-software.com/static/book/Appendix_A__Varnish_Programs.html).

@Vikas, a disclosure here: I don't have that much experience running elaborate customised Varnish configs in prod, just some experience and observations from running vanilla Varnish with small tweaks. I have more experience with the static publisher.

With the static publisher, the response time gains on the server are rather massive, on the order of hundreds or thousands of times (serving a static file takes under 1ms, a dynamic request hundreds of ms). Another opportunity here is that you can publish the static cache files to another machine and have that act as the webserver - that gives you a strong improvement in security (although you can't serve any dynamic stuff any more). And then it's all PHP and static files, so you can actually debug the thing.

Static publisher bugs can be hard to pinpoint. To give you a taste: a couple of years ago I hit an issue with the homepage going missing from time to time. It turned out to be caused by an overnight job hitting an external RedirectorPage which, from what I remember, resulted in a trimmed "" URL to which an empty string was written (because it couldn't fetch external URLs) - and the "" URL happens to be the home page...

Varnish does a cache miss immediately if it discovers a cookie, so if you hit a page with a CSRF token Varnish will not cache it. I think the default config is pretty safe in that respect, and you will find yourself cursing how much code in SS tries to start a session :-) As soon as you have one, Varnish caching is off limits.

Regarding "sticky sessions", this required compiling the header vmod (because whole world violates - kind of - the RFC and sends multiple cookie headers on multiple lines...). If you don't use Session::cookie_secure though, you could cleverly append the current backend name to the (only) session Cookie and avoid the whole vmod debacle - it's a bit of a hack, but chuck the backend name into PHPSESSID somewhere in the vcl on the way out, and strip it out also in vcl when the request comes back.

Otherwise, here is the code I currently have: https://gist.github.com/mateusz/2662a7d0b7a183503d9c . It's tested and it works, but it's not really heavily used at the moment so I'd be keen to hear any feedback if you find any issues ;-)

m

Mateusz Uzdowski

May 12, 2014, 12:16:17 AM
to silverst...@googlegroups.com, kle...@kinkcreative.com
Guys, I'm just looking for quick feedback on the direction of a module. I'm trying to make the cache headers more configurable (on account of not being able to change the Vary header, which shouldn't always apply).


The goal is to have an easy way to apply per-controller caching policies. Currently the module above allows you to simply do the following:

Injector:
  MyCachingPolicy:
    class: CachingPolicy
    properties:
      cacheAge: 300
      vary: 'Cookie, X-Forwarded-Protocol, Accept'
HomePage_Controller:
  dependencies:
    Policies: '%$MyCachingPolicy'

You can see that we have just configured a customised Vary header and a 300s max-age just for HomePage_Controller.

The "CachingPolicy" supplied with the module reimplements the global hardcoded HTTP::add_cache_headers in a slightly ugly way, but should be backwards compatible. If it works out I'd love to remove the HTTP::add_cache_headers in the next version of SilverStripe.

Thanks to @stojg for supplying the CustomHeaderPolicy which allows you to add extra headers without effort (also bundled now):

  CustomPolicy:
    class: CustomHeaderPolicy
    properties:
      headers:
        Custom-Header: "Hello"

And thanks to @tractorcow for all the discussion and inspiration on how to handle this issue (last time I saw him, @tractorcow was trying to work out how we could implement a rate-limiting policy using the same approach).

What d'you think? Yee? Nah?

Mateusz



Mateusz Uzdowski

Jul 10, 2014, 12:25:09 AM
to silverst...@googlegroups.com
If you ever had a problem with the Vary header, I've added a handy table to the controllerpolicy module (now moved to silverstripe-labs): https://github.com/silverstripe-labs/silverstripe-controllerpolicy . It also now uses a new, better default of 'Cookie, X-Forwarded-Protocol', which makes responses much more cacheable.

Don't forget to pop off these pesky frontend-only cookies on Varnish (__utma, __utmb, _ga and whatever else).

m

Conrad Dobbs

Jul 10, 2014, 4:32:42 PM
to silverst...@googlegroups.com, kle...@kinkcreative.com
If you're looking to use Redis for caching, someone has already written a Redis backend for Zend Cache: https://github.com/colinmollenhour/Cm_Cache_Backend_Redis. Redis can be used as a slow backend, as it supports tags.
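
As a hedged sketch of how that backend plugs into Zend_Cache (standard Zend_Cache::factory usage with custom backend naming; the server details, lifetime and tag names are placeholders, and Cm_Cache_Backend_Redis must be autoloadable):

<?php
// Build a Zend_Cache instance on top of the Redis backend, then use it like any
// other Zend cache: save with tags, load by id. SilverStripe bundles Zend_Cache.
$cache = Zend_Cache::factory(
    'Core',
    'Cm_Cache_Backend_Redis',
    array('lifetime' => 300, 'automatic_serialization' => true),
    array('server' => '127.0.0.1', 'port' => 6379),
    false, // standard frontend naming ('Core')
    true   // custom backend naming (full class name above)
);
$cache->save('<p>Hello</p>', 'homepage_html', array('tag_pages'), 300);
echo $cache->load('homepage_html');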

Anselm Christophersen

Jul 12, 2014, 1:28:00 PM
to silverst...@googlegroups.com
I love it.
I did a test case with Varnish approx. 2 years ago, and ran into exactly those header problems.
I did some custom config tweaking, but eventually abandoned it, as I had the feeling it was too error-prone.
Do you use it with Varnish? And are you using a custom Varnish configuration?

Anselm


Mateusz Uzdowski

Jul 13, 2014, 4:59:54 PM
to silverst...@googlegroups.com
A custom Varnish configuration is pretty much a given with the way VCL works. The only tweak in the VCL is to pop off frontend cookies (_ga, __utma, __utmb and the like), which is not a big job. On the SilverStripe side you need to make sure the session doesn't start unnecessarily - there was a bug in 3.1 recently that caused the session to start on every request.

If you have completely generic controllers, without any personalised content, you can use a smaller Vary header - but the Cookie header will still prevent any caching from happening because of https://www.varnish-software.com/static/book/VCL_Basics.html#default-vcl-recv - in that case you'd need to write your own vcl_recv and subvert that Cookie check.

And THEN your content will be cached ;-) Oh well that was a bit off-topic wasn't it.

m


--

Mateusz Uzdowski | Developer
SilverStripe
http://silverstripe.com/

Phone: +64 4 978 7330 xtn 68
Skype: MateuszUzdowski
