I have a web service that allows users to supply a callback URL for receiving results when the work completes.
We want to use each callback URL as a command key, so that we can break the circuit if a client's URL is extremely latent or starts throwing exceptions back to us.
To prevent malicious users from consuming all of our threads, we also want to use API tokens to implement a bulkhead, using the semaphore isolation strategy to limit each client's concurrent executions.
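To make the bulkhead half of this concrete, here is a minimal plain-JDK sketch of the behaviour we are after. It is not Hystrix code and not how Hystrix implements semaphore isolation; `TokenBulkhead`, `execute`, and the fallback parameter are hypothetical names used only for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical per-API-token bulkhead built on java.util.concurrent.Semaphore. */
final class TokenBulkhead {
    // One semaphore per API token. Note that this map is never cleaned up either,
    // so a real version would need some form of eviction.
    private final ConcurrentMap<String, Semaphore> permits = new ConcurrentHashMap<>();
    private final int maxConcurrentPerToken;

    TokenBulkhead(int maxConcurrentPerToken) {
        this.maxConcurrentPerToken = maxConcurrentPerToken;
    }

    /** Runs work if the token has a free permit; otherwise returns the fallback without blocking. */
    <T> T execute(String apiToken, Supplier<T> work, Supplier<T> fallback) {
        Semaphore semaphore =
                permits.computeIfAbsent(apiToken, t -> new Semaphore(maxConcurrentPerToken));
        if (!semaphore.tryAcquire()) {
            return fallback.get(); // token is already at its concurrency limit
        }
        try {
            return work.get();
        } finally {
            semaphore.release(); // always hand the permit back
        }
    }
}
```

The point is that a rejected call fails fast instead of queueing, so one client cannot tie up threads that other clients need.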
We initially tried to use Hystrix by following the examples in the documentation but had to revert it because we saw memory usage balloon significantly and ran out of memory after a few days.
Looking at the code, I believe where we went wrong is in using so many distinct keys for our commands.
I found that each key is added to a static ConcurrentHashMap and does not appear to ever be removed for the lifetime of the application.
This leads to persisting all of the counters and state for a given key indefinitely instead of only for a short period of time.
Ideally, a key and its corresponding information would only persist until its window has passed since its last activity, at which point it could be evicted.
It seems like you could easily do this with a Guava cache if every command had the same windows and the storage wasn't statically initialized.
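To illustrate the kind of eviction I have in mind, here is a rough plain-JDK sketch (a Guava cache with expire-after-access would give similar behaviour). It is not based on Hystrix internals; `ExpiringKeyedState`, `evictIdle`, and the explicit `nowMillis` parameter are hypothetical and only there to keep the example self-contained.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical per-key state holder that forgets keys idle for longer than a window. */
final class ExpiringKeyedState<V> {
    private static final class Entry<V> {
        final V value;
        long lastAccessMillis; // only touched inside compute/computeIfPresent

        Entry(V value, long now) {
            this.value = value;
            this.lastAccessMillis = now;
        }
    }

    private final ConcurrentMap<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long idleWindowMillis;
    private final Function<String, V> factory;

    ExpiringKeyedState(long idleWindowMillis, Function<String, V> factory) {
        this.idleWindowMillis = idleWindowMillis;
        this.factory = factory;
    }

    /** Returns the state for the key, creating it if absent, and records the access time. */
    V get(String key, long nowMillis) {
        return entries.compute(key, (k, existing) -> {
            if (existing == null) {
                return new Entry<>(factory.apply(k), nowMillis);
            }
            existing.lastAccessMillis = nowMillis;
            return existing;
        }).value;
    }

    /** Drops every key whose last access is older than the idle window; returns how many were dropped. */
    int evictIdle(long nowMillis) {
        int[] removed = new int[1];
        for (String key : entries.keySet()) {
            // Returning null from computeIfPresent removes the mapping atomically.
            entries.computeIfPresent(key, (k, e) -> {
                if (nowMillis - e.lastAccessMillis > idleWindowMillis) {
                    removed[0]++;
                    return null;
                }
                return e;
            });
        }
        return removed[0];
    }

    int size() {
        return entries.size();
    }
}
```

Something like `evictIdle(System.currentTimeMillis())` could be called from a scheduled task, so the state for a callback URL that has gone quiet is eventually released rather than held for the lifetime of the application.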
Is this a problem anyone else has encountered and found a good solution for?
Is this a configuration problem or am I using Hystrix wrong?
If I am using it wrong, does anyone have suggestions for how we could achieve our goals with Hystrix some other way?