Alastair -
This is a great question. We face a very similar problem in our setup. We use Hystrix in both the gateway (Zuul) and the layer underneath it (which routes to multiple business-logic microservices). Hystrix essentially plays two different roles in the two systems.
In Zuul, it's very hard to produce a reasonable fallback, so its utility is limited to metrics, timeouts, and some fairly loose resource-bounding via semaphores.
In the layer underneath (the API server), there are many calls for specific pieces of functionality. In these cases, it is possible to have good fallbacks. As a result, we have much stricter resource bounds (either via threadpool or semaphore), and regularly induce failures in order to test that fallbacks work as expected.
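To make the API-server role concrete, here is a rough sketch in plain Java of a semaphore-bounded call with a fallback and a switch for inducing failures. It is only an illustration of the idea, not Hystrix's API and not our production code: `BoundedCall`, `setForceFailure`, and the "live"/"cached" values are names made up for this example, and it leaves out timeouts, metrics, and circuit breaking.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical helper (not part of Hystrix): a semaphore-bounded call with a fallback. */
final class BoundedCall<T> {
    private final Semaphore permits;      // caps the number of concurrent executions
    private final Supplier<T> primary;    // the real call to the downstream service
    private final Supplier<T> fallback;   // what to return when the real call is unavailable
    private volatile boolean forceFailure = false; // test hook for inducing failures

    BoundedCall(int maxConcurrent, Supplier<T> primary, Supplier<T> fallback) {
        this.permits = new Semaphore(maxConcurrent);
        this.primary = primary;
        this.fallback = fallback;
    }

    /** Turn induced failure on or off, so the fallback path can be exercised on purpose. */
    void setForceFailure(boolean forceFailure) {
        this.forceFailure = forceFailure;
    }

    T execute() {
        // Strict bound: if no permit is free, reject immediately instead of queueing.
        if (!permits.tryAcquire()) {
            return fallback.get();
        }
        try {
            if (forceFailure) {
                throw new IllegalStateException("induced failure");
            }
            return primary.get();
        } catch (RuntimeException e) {
            // Any failure of the primary call degrades to the fallback.
            return fallback.get();
        } finally {
            permits.release();
        }
    }
}
```

With something like `new BoundedCall<>(10, () -> "live", () -> "cached")`, calling `setForceFailure(true)` makes `execute()` take the fallback path, which is the kind of check we run regularly. In the gateway the same shape applies, but with a much larger permit count and no meaningful fallback to return.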