Recommended configuration for Hystrix in front of a non-local cache

Johan Haleby

May 12, 2014, 10:22:58 AM

to hystr...@googlegroups.com

Hi,

I'm planning to use Hystrix as a circuit breaker for requests to a distributed method cache. If the cache is not responding or is slow, I simply want requests to continue to the original source (i.e. skip the cache). Right now I've implemented this with two Hystrix commands: one for getting elements out of the cache (command key "cache-get") and one for putting elements into the cache (command key "cache-put"). They share the same Hystrix thread pool (group key).

However, I see a potential problem with this approach. If the cache is down (cache-get fails, maybe because its circuit breaker has opened) and the request to the original source works as expected, then the response will be inserted into the cache (cache-put), which will likely also fail. Would it be better to use a single Hystrix command (simply called "cache" or something) instead of separating get and put, so that the circuit breaker kicks in for both operations at the same time? The downside is that it would no longer be possible to distinguish between cache-get and cache-put in the Hystrix Dashboard. What's the recommended approach?

Regards,

/Johan

Ben Christensen

May 12, 2014, 8:24:35 PM

to Johan Haleby, hystr...@googlegroups.com

This is a pretty common pattern at Netflix as well. A variant on it is shown here: https://github.com/Netflix/Hystrix/wiki/How-To-Use#fallback-cache-via-network except that you'll hit the cache first and then origin. I have no idea why I haven't documented the cache-then-origin pattern as we use it often.

We keep each source (cache and origin) in a separate command and a separate thread pool so that one cannot saturate the other's resources and so that metrics are available for both. We chain them using getFallback, so regardless of how the first starts failing we fall back to the secondary. In other words, they are in the same group but have different command keys and thread pools.

The difference in your case is that a cache miss sounds like an expected outcome rather than a failure (we treat a miss as a failure because we populate the caches eagerly). In that case, don't throw an exception on a miss. Instead, conditionally invoke the second command from the first when a miss occurs, and also invoke it from getFallback when fetching from the cache fails. Note that under load, if the cache has failed completely, both commands will likely end up short-circuited, since the origin probably can't handle the traffic once the caches are dead. Consider another, local fallback strategy for the origin command if one is possible.

In short, your approach with two commands is correct, but give them separate thread pools. Otherwise one going bad can cause the other to go bad as well (rejecting requests once the shared pool is saturated), unless that's what you want, of course.

Ben Christensen

310.782.5511

@benjchristensen
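
For readers who want to see the shape of the cache-then-origin flow described above, here is a minimal, library-free Java sketch. It does not use Hystrix: two plain executors stand in for the two thread pools, a timeout or exception stands in for a failing command, and the catch block plays the role of getFallback. The class name, pool sizes, and timeouts are all assumptions made for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical sketch: isolated pools for cache and origin, origin as fallback.
class CacheThenOrigin {
    // Separate pools so a slow cache cannot use up the origin's threads.
    private final ExecutorService cachePool = daemonPool(2);
    private final ExecutorService originPool = daemonPool(2);
    private final Function<String, String> cache;   // returns null on a miss
    private final Function<String, String> origin;

    CacheThenOrigin(Function<String, String> cache, Function<String, String> origin) {
        this.cache = cache;
        this.origin = origin;
    }

    String get(String key) {
        String cached;
        try {
            cached = call(cachePool, () -> cache.apply(key), 100);
        } catch (RuntimeException cacheFailure) {
            // Error or timeout talking to the cache: fall back to the origin.
            return call(originPool, () -> origin.apply(key), 1000);
        }
        // A miss is an expected outcome, so go to the origin without throwing.
        return cached != null ? cached : call(originPool, () -> origin.apply(key), 1000);
    }

    private static <T> T call(ExecutorService pool, Callable<T> task, long timeoutMs) {
        Future<T> future = pool.submit(task);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            future.cancel(true); // free the worker if the task is still running
            throw new IllegalStateException("call failed or timed out", e);
        }
    }

    private static ExecutorService daemonPool(int size) {
        return Executors.newFixedThreadPool(size, runnable -> {
            Thread thread = new Thread(runnable);
            thread.setDaemon(true); // let the JVM exit without an explicit shutdown
            return thread;
        });
    }

    public static void main(String[] args) {
        CacheThenOrigin miss = new CacheThenOrigin(k -> null, k -> "origin-" + k);
        System.out.println(miss.get("a"));
    }
}
```

The sketch has no circuit breaker, metrics, or bounded queues, all of which Hystrix provides; it only shows that a miss and a cache failure take different paths to the same origin call.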


Johan Haleby

May 19, 2014, 12:48:00 AM

to hystr...@googlegroups.com, Johan Haleby

Hi Ben,

Thanks for your answer. However, my problem is a little different. To be more precise, I'm developing a library based on Spring Cache that can be reused in many applications. For example:

@Cacheable(value = "cacheName", key = "#y")
public X x(String y) { .. }

It's inside the "cacheable implementation" triggered by the @Cacheable annotation that I want to use Hystrix as a circuit breaker around the call to the cache, i.e. x -> Hystrix -> cache. The reason is to make it transparent to end users so that they don't have to set up and use a Hystrix command explicitly for each call to the cache. This means I cannot have fallback logic inside the cache implementation that calls the original method (x), since I don't know what the original method is (Spring doesn't provide it, afaik). So what I do now is return null as the fallback, since Spring then knows that it should invoke the original method (null is also what a cache miss returns, and perhaps that's the root of the problem). The problem is that after x() returns, Spring will try to put the result in the cache. But if the cache was down or slow during the get request (so the Hystrix fallback kicked in and returned null), then it's quite probable that the put request will also fail.

What approach would you recommend here? Or do you think it's bad practice to "hide" Hystrix in a library like this? Note that when I talk about "Hystrix" in this example, I mean for the cache only. My intention is that there will be another Hystrix command guarding x (with a different thread pool) that the client must be more explicit about, but that's not my concern here.

Regards,

/Johan

Johan Haleby

May 19, 2014, 1:15:01 AM

to hystr...@googlegroups.com, Johan Haleby

One idea to solve the problem would be to check whether the circuit breaker really is closed before calling the cache write logic. But could this affect the way Hystrix works, since in that case we won't call the run method at all?
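
The idea of checking breaker state before the write can be sketched without Hystrix as follows. This is a hand-rolled stand-in, not Hystrix's circuit breaker: SimpleBreaker, RemoteCache, and GuardedCache are hypothetical names, the breaker opens after a fixed number of consecutive failures, and it is shared by get and put so that a failing get also suppresses the following put. Recovery is deliberately omitted; a real breaker lets a trial request through after a while so the circuit can close again, and this sketch never does.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical breaker: opens after `threshold` consecutive failures.
class SimpleBreaker {
    private final int threshold;
    private final AtomicInteger consecutiveFailures = new AtomicInteger();

    SimpleBreaker(int threshold) { this.threshold = threshold; }

    boolean isOpen() { return consecutiveFailures.get() >= threshold; }
    void recordSuccess() { consecutiveFailures.set(0); }
    void recordFailure() { consecutiveFailures.incrementAndGet(); }
}

// Hypothetical client interface for the remote cache.
interface RemoteCache {
    String get(String key);
    void put(String key, String value);
}

// Wraps the cache so that get and put share one breaker.
class GuardedCache {
    private final RemoteCache delegate;
    private final SimpleBreaker breaker;

    GuardedCache(RemoteCache delegate, SimpleBreaker breaker) {
        this.delegate = delegate;
        this.breaker = breaker;
    }

    // Returns null on a miss, on failure, or when the breaker is open.
    String get(String key) {
        if (breaker.isOpen()) return null; // short-circuit: looks like a miss
        try {
            String value = delegate.get(key);
            breaker.recordSuccess();
            return value;
        } catch (RuntimeException e) {
            breaker.recordFailure();
            return null;
        }
    }

    // Returns false when the write was skipped or failed.
    boolean put(String key, String value) {
        if (breaker.isOpen()) return false; // skip the write, never touch the cache
        try {
            delegate.put(key, value);
            breaker.recordSuccess();
            return true;
        } catch (RuntimeException e) {
            breaker.recordFailure();
            return false;
        }
    }
}
```

Because skipped calls never reach the cache, they record neither success nor failure, which is exactly the concern raised above: whatever decides when to retry has to live somewhere other than the skipped call path.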