Hello there,
regarding Bucket4j-based throttling in CAS: we recently switched the bucket's refill strategy from INTERVALLY to GREEDY and were quite surprised by the test results, which did not match what the CAS documentation says. I would like to share our findings here and hear your take on them.
We tested this specifically for CAS Simple MFA Rate Limiting, but the problem applies to other "capacity throttling" usages in CAS as well. Based on the documentation, we first had a configuration like this:
cas.authn.mfa.simple.bucket4j.bandwidth[0].capacity=6
cas.authn.mfa.simple.bucket4j.bandwidth[0].duration=PT1M
cas.authn.mfa.simple.bucket4j.bandwidth[0].refill-count=3
cas.authn.mfa.simple.bucket4j.bandwidth[0].refill-duration=PT1M
cas.authn.mfa.simple.bucket4j.bandwidth[0].refill-strategy=INTERVALLY
This seemed to work just fine. Then we decided to switch the refill strategy. So we changed the corresponding line to:
cas.authn.mfa.simple.bucket4j.bandwidth[0].refill-strategy=GREEDY
According to both the CAS and Bucket4j docs, the only difference should be the rate at which tokens are refilled: GREEDY means as soon as possible (1 token per 20 seconds), while INTERVALLY means once per configured period (3 tokens after 1 minute). But to our surprise, when we repeated the tests with GREEDY, starting from a completely depleted bucket, users could send twice as many messages within the testing period before being blocked.
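To make the expectation concrete, here is a minimal plain-Java sketch of how many tokens each strategy should have restored at a given time after depletion, using refill-count=3 and refill-duration=PT1M as the docs describe them. It is a simplified model, not Bucket4j itself, and the class and method names are made up for illustration.

```java
import java.time.Duration;

// Simplified model of the documented refill behaviour (not Bucket4j's actual implementation).
class ExpectedRefill {

    // GREEDY: tokens trickle back one by one, spread evenly over the refill period.
    static long greedy(long capacity, long refillCount, Duration refillPeriod, Duration elapsed) {
        long restored = elapsed.toMillis() * refillCount / refillPeriod.toMillis();
        return Math.min(capacity, restored);
    }

    // INTERVALLY: the whole refill count is added at once, only when a full period has elapsed.
    static long intervally(long capacity, long refillCount, Duration refillPeriod, Duration elapsed) {
        long restored = (elapsed.toMillis() / refillPeriod.toMillis()) * refillCount;
        return Math.min(capacity, restored);
    }

    public static void main(String[] args) {
        Duration minute = Duration.ofMinutes(1);
        // Print the restored token count every 20 seconds, starting from an empty bucket.
        for (int s = 0; s <= 120; s += 20) {
            Duration t = Duration.ofSeconds(s);
            System.out.printf("t=%3ds greedy=%d intervally=%d%n",
                    s, greedy(6, 3, minute, t), intervally(6, 3, minute, t));
        }
    }
}
```

In this model both strategies restore 3 tokens per minute on average, so neither should allow more messages than the other over a longer testing period.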
----
After a thorough investigation, we have found the following.
The discrepancy between the CAS documentation and reality seems to have been there since the introduction of "advanced rate-limiting config for bucket4j" in CAS v6.5.0-RC5. We think the problem is this code in the InMemoryBucketStore class (shortened here for readability; the call to ".withInitialTokens()" is done again after the switch anyway):
var limit = switch (bandwidth.getRefillStrategy()) {
    case INTERVALLY -> Bandwidth.classic(bandwidth.getCapacity(),
        Refill.intervally(bandwidth.getRefillCount(), Beans.newDuration(bandwidth.getRefillDuration())));
    case GREEDY -> Bandwidth.simple(bandwidth.getCapacity(), Beans.newDuration(bandwidth.getDuration()));
};
As one can see, the two branches use quite different CAS configuration properties and Bucket4j API methods to create the Bandwidth and its Refill. We believe the code above should basically be changed to:
var limit = switch (bandwidth.getRefillStrategy()) {
    case INTERVALLY -> Bandwidth.classic(bandwidth.getCapacity(),
        Refill.intervally(bandwidth.getRefillCount(), Beans.newDuration(bandwidth.getRefillDuration())));
    case GREEDY -> Bandwidth.classic(bandwidth.getCapacity(),
        Refill.greedy(bandwidth.getRefillCount(), Beans.newDuration(bandwidth.getRefillDuration())));
};
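For what it's worth, the factor of two seems to fall out of the parameters alone. As far as we understand the Bucket4j javadoc, Bandwidth.simple(capacity, period) is shorthand for Bandwidth.classic(capacity, Refill.greedy(capacity, period)), so the current GREEDY branch refills "capacity" tokens per "duration" and ignores refill-count and refill-duration entirely. A small plain-Java sketch (hypothetical helper names, no Bucket4j dependency) comparing the resulting seconds per token:

```java
import java.time.Duration;

// Illustration only: derives the greedy refill pace from the CAS properties each code path reads.
class EffectiveRefillRate {

    // Current GREEDY branch: Bandwidth.simple(capacity, duration),
    // assumed equivalent to a greedy refill of `capacity` tokens per `duration`.
    static double currentGreedySecondsPerToken(long capacity, Duration duration) {
        return duration.toMillis() / 1000.0 / capacity;
    }

    // Proposed GREEDY branch: Refill.greedy(refillCount, refillDuration).
    static double proposedGreedySecondsPerToken(long refillCount, Duration refillDuration) {
        return refillDuration.toMillis() / 1000.0 / refillCount;
    }

    public static void main(String[] args) {
        Duration minute = Duration.ofMinutes(1);
        // capacity=6, duration=PT1M  -> one token every 10 seconds
        System.out.println(currentGreedySecondsPerToken(6, minute));
        // refill-count=3, refill-duration=PT1M -> one token every 20 seconds
        System.out.println(proposedGreedySecondsPerToken(3, minute));
    }
}
```

With our configuration the current code restores one token every 10 seconds instead of the documented 20, which matches the doubled message count we observed.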
Another logical step is to remove the now superfluous bandwidth[0].duration property, as Bucket4j apparently lets you define the speed of refill, but not the speed of token consumption (when full, the whole bucket can be exhausted practically immediately).
----
For now, we have simply switched back to INTERVALLY, achieving approximately the same desired results with this altered configuration (while there are surely other ways, this is the most readable for us):
cas.authn.mfa.simple.bucket4j.bandwidth[0].capacity=6
cas.authn.mfa.simple.bucket4j.bandwidth[0].refill-count=1
cas.authn.mfa.simple.bucket4j.bandwidth[0].refill-duration=PT30S
cas.authn.mfa.simple.bucket4j.bandwidth[0].refill-strategy=INTERVALLY
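As a rough sanity check of this workaround, the time an interval-refilled bucket needs to go from empty back to full can be computed directly from the properties. The sketch below is plain Java with a hypothetical helper name and assumes no tokens are consumed while the bucket refills.

```java
import java.time.Duration;

// Illustration only: time for an INTERVALLY-refilled bucket to recover from empty to full.
class IntervalRefillTime {

    static Duration timeToFull(long capacity, long refillCount, Duration refillDuration) {
        // Number of refill intervals needed, rounded up (ceiling division).
        long intervals = (capacity + refillCount - 1) / refillCount;
        return refillDuration.multipliedBy(intervals);
    }

    public static void main(String[] args) {
        // Workaround above: capacity=6, refill-count=1, refill-duration=PT30S
        System.out.println(timeToFull(6, 1, Duration.ofSeconds(30))); // prints PT3M
        // Original INTERVALLY config: capacity=6, refill-count=3, refill-duration=PT1M
        System.out.println(timeToFull(6, 3, Duration.ofMinutes(1))); // prints PT2M
    }
}
```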
Best regards
Petr