I was listening to Gil's podcast on SE Radio [1]. He mentioned that the circuit breaker is mostly meant to deal with slow responses rather than errors. It occurred to me that the decision to trip the circuit breaker should be based on concurrency rather than on the timeout value, because concurrency better represents the response-time behaviour of the API.
E.g. if X is the timeout value and we trip the circuit breaker on 3 errors, then the concurrent requests for that resource would be:
N = 3 * X
However, if the response time is X/2, the same concurrency would be reached by 6 consecutive requests, and that would probably have the same effect:
N = 6 * X/2
So ideally, if we could identify broad response-time buckets and calculate the overall concurrency by adding up the concurrency of each bucket (Little's law is additive), we would get an accurate picture.
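To make the bucket idea concrete, here is a rough Python sketch of how I imagine the calculation. It is only an illustration under my own assumptions: the names Bucket, total_concurrency and should_trip are made up, each bucket is described by an arrival rate and a representative response time, and the counts 3 and 6 above are treated as arrival rates so that Little's law (N = arrival rate * response time) applies. The trip limit is likewise an assumed value, not something taken from a real circuit breaker library.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Bucket:
    """One response-time bucket (hypothetical structure for illustration)."""
    arrival_rate: float   # requests per second that fall into this bucket
    response_time: float  # representative response time of the bucket, in seconds


def concurrency(bucket: Bucket) -> float:
    # Little's law for a single bucket: N = arrival rate * response time.
    return bucket.arrival_rate * bucket.response_time


def total_concurrency(buckets: Sequence[Bucket]) -> float:
    # Add up the per-bucket concurrency to estimate the overall concurrency.
    return sum(concurrency(b) for b in buckets)


def should_trip(buckets: Sequence[Bucket], limit: float) -> bool:
    # Trip once the estimated concurrency reaches the (assumed) limit.
    return total_concurrency(buckets) >= limit


X = 2.0  # assumed timeout value, in seconds

timed_out = [Bucket(arrival_rate=3, response_time=X)]      # N = 3 * X
half_speed = [Bucket(arrival_rate=6, response_time=X / 2)]  # N = 6 * X/2
mixed = [Bucket(2, X), Bucket(2, X / 2)]                    # two buckets added together

for name, buckets in [("timed_out", timed_out), ("half_speed", half_speed), ("mixed", mixed)]:
    print(name, total_concurrency(buckets), should_trip(buckets, limit=3 * X))
```

With these numbers all three cases come out at the same concurrency, 3 * X, so a breaker keyed on concurrency would treat them alike, whereas one that only counts timeouts would react to the first case alone.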
This is just an exercise on paper and I have not used it anywhere. Any comments?
Shripad