This question is regarding:
https://github.com/envoyproxy/ratelimit

We are currently running a single ratelimit server with many clients sending/receiving data over gRPC (about 30-40 clients to 1 server).
The issue we are seeing is that the server can't keep up with the request rate from the clients. gRPC requests end up getting queued on the server, with a backlog that can grow to 30 minutes behind, i.e. the client is sending request data from 30 minutes ago.
We're running the ratelimit server in k8s, so I'm wondering if I can run multiple instances at once to handle the load/request rate.
The set-up would be as follows:
(30-40) clients -> k8s load balancer -> (2-4) servers handling requests -> redis backend
Looking at the code, I don't think this should be an issue, specifically when the local cache is disabled so that everything goes to Redis (and the increment on the cache key in Redis is atomic).
Disabling the cache via:
```
LOCAL_CACHE_SIZE_IN_BYTES="0"
```
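For context, a minimal sketch of what I have in mind as a k8s Deployment (the name, labels, image tag, and `REDIS_URL` value here are placeholders, not taken from the repo, and the Service/load balancer in front is omitted):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ratelimit            # placeholder name
spec:
  replicas: 3                # scale out the stateless ratelimit servers
  selector:
    matchLabels:
      app: ratelimit
  template:
    metadata:
      labels:
        app: ratelimit
    spec:
      containers:
        - name: ratelimit
          image: envoyproxy/ratelimit:latest   # placeholder tag
          env:
            - name: LOCAL_CACHE_SIZE_IN_BYTES
              value: "0"                       # disable the per-pod local cache
            - name: REDIS_URL
              value: "redis:6379"              # shared Redis backend (placeholder address)
```

All replicas would share the same Redis backend, so the counters would stay consistent across pods.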
I was wondering if anyone knows whether it is safe to do this, or if there might be some concurrency issues.
I'd also be curious how folks usually run the server for large environments with a high request rate per second.