To tolerate bursts of requests, I have configured the token bucket as follows.
```yaml
name: envoy.filters.http.local_ratelimit
typed_config:
  "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
  stat_prefix: http_local_rate_limiter
  token_bucket:
    max_tokens: 200
    tokens_per_fill: 20
    fill_interval: 1s
  filter_enabled:
    runtime_key: local_rate_limit_enabled
    default_value:
      numerator: 100
      denominator: HUNDRED
  filter_enforced:
    runtime_key: local_rate_limit_enforced
    default_value:
      numerator: 100
      denominator: HUNDRED
  response_headers_to_add:
    - append: false
      header:
        key: x-local-rate-limit
        value: "true"
```
This means that a burst of up to 200 requests can be accepted at once, and after that the bucket refills by 20 tokens every second to prepare for the next burst.
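To make my understanding of these numbers concrete, here is a minimal Python sketch of a token bucket with the same parameters. It is only a model of the semantics as I understand them, not Envoy's actual implementation; the `TokenBucket` class, its `allow` method, and the assumption that the bucket starts full are all mine.

```python
class TokenBucket:
    """Toy model of the token_bucket settings above (not Envoy code)."""

    def __init__(self, max_tokens, tokens_per_fill, fill_interval):
        self.max_tokens = max_tokens
        self.tokens_per_fill = tokens_per_fill
        self.fill_interval = fill_interval  # seconds
        self.tokens = max_tokens            # assumption: bucket starts full
        self.last_fill = 0.0

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) is accepted."""
        # Add tokens for every whole fill interval that has elapsed.
        fills = int((now - self.last_fill) // self.fill_interval)
        if fills > 0:
            self.tokens = min(self.max_tokens,
                              self.tokens + fills * self.tokens_per_fill)
            self.last_fill += fills * self.fill_interval
        # Each accepted request consumes one token.
        if self.tokens > 0:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(max_tokens=200, tokens_per_fill=20, fill_interval=1.0)
burst = sum(bucket.allow(0.0) for _ in range(250))    # 250 requests at t=0
refilled = sum(bucket.allow(1.0) for _ in range(50))  # 50 requests at t=1s
print(burst, refilled)
```

In this model all accepted requests pass through immediately; nothing spreads them out over time.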
However, what I want now is for these 200 requests not to reach the upstream server all at once, but to be forwarded to it with some delay. Specifically, I'm hoping for something equivalent to the `delay` option of nginx's `limit_req` directive.
Does Envoy have any feature for specifying this kind of delay? If not, is it planned for the future?
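To illustrate the behavior I am asking about, here is a rough Python sketch of how I understand nginx's `limit_req` with `burst` and `delay` to pace requests. It is a simplified model written for this question, not nginx's or Envoy's code; the function `simulate_limit_req` and its parameters are hypothetical names.

```python
def simulate_limit_req(arrivals, rate, burst, delay):
    """Simplified model of nginx-style limit_req pacing.

    arrivals: sorted request arrival times in seconds
    rate:     sustained rate in requests per second
    burst:    maximum number of excess requests that may be queued
    delay:    number of excess requests forwarded without delay
    Returns one entry per request: the time it is forwarded upstream,
    or None if it is rejected.
    """
    excess = 0.0
    last = None
    results = []
    for t in arrivals:
        if last is None:
            new_excess = 0.0
        else:
            # Excess drains at `rate` per second; each request adds one.
            new_excess = max(0.0, excess - rate * (t - last) + 1.0)
        if new_excess > burst:
            results.append(None)  # over the burst limit: rejected
            continue
        excess, last = new_excess, t
        # Requests beyond `delay` are held back so they leave at `rate`.
        wait = max(0.0, new_excess - delay) / rate
        results.append(t + wait)
    return results


# Four simultaneous requests, 20 r/s, burst of 2, every excess request delayed.
print(simulate_limit_req([0, 0, 0, 0], rate=20, burst=2, delay=0))
```

With these inputs the first request is forwarded immediately, the next two are spaced 50 ms apart, and the fourth is rejected. That spacing of the burst is what I would like to achieve in Envoy.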