I have a backend service that exposes a single endpoint performing database updates. Each request takes roughly 5-25 seconds. The Flask app runs under Apache/mod_wsgi on 2 Elastic Beanstalk instances with 2 CPUs and 8 GB RAM each. I want to limit the traffic sent to the app to 10-20 in-flight requests per instance. Any additional requests should queue in the backlog and, if not picked up within the configured timeout, be dropped and retried later.
For some reason the number of requests per process goes far higher than that, so average latency climbs and the failure rate rises with it. I set up recording of the `mod_wsgi.process_metrics()` stats and verified that the process-level `request_count` grows from 0 to over 180, with the per-thread `request_count` values (under the `"threads"` key) ranging from 24 to 65, all within less than an hour. I tried reducing Apache's `MaxRequestWorkers` and `ListenBackLog`, as well as `WSGIDaemonProcess`'s `queue-timeout` and `listen-backlog`, but no luck. It's important not to accept a request once capacity is reached and then have it time out halfway through processing because all threads were busy. This is a worker service, so dropping the request is fine; it will be retried later.
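For reference, the recording boils down to periodically snapshotting `mod_wsgi.process_metrics()` and reducing each snapshot to the two numbers above. A stripped-down sketch of that reduction (the dict shape is what I observe in mod_wsgi 4.6.5; the scheduling and logging around it are omitted):

```python
def summarize_metrics(metrics):
    """Reduce one mod_wsgi.process_metrics() snapshot to the numbers tracked
    above: the process-level "request_count" plus the min/max of the
    per-thread "request_count" values in the "threads" list."""
    per_thread = [t["request_count"] for t in metrics.get("threads", [])]
    return {
        "process_request_count": metrics["request_count"],
        "thread_request_counts": (min(per_thread), max(per_thread)) if per_thread else None,
    }
```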
I'm using Apache 2.4.39 and mod_wsgi 4.6.5. Apache is configured as follows:
```apache
<IfModule reqtimeout_module>
    RequestReadTimeout header=15,MinRate=500 body=150,MinRate=500
</IfModule>

Timeout 150
KeepAlive Off
KeepAliveTimeout 0

<IfModule mpm_worker_module>
    StartServers            1
    ServerLimit             2
    MinSpareThreads         5
    MaxSpareThreads         10
    ThreadLimit             5
    ThreadsPerChild         5
    ListenBackLog           10
    MaxRequestWorkers       10
    MaxConnectionsPerChild  5000
</IfModule>
```
The mod_wsgi configuration:

```apache
WSGIDaemonProcess wsgi \
    processes=10 \
    threads=5 \
    display-name=%{GROUP} \
    python-home=/opt/python/run/venv/ \
    python-path=/opt/python/current/app \
    user=wsgi \
    group=wsgi \
    home=/opt/python/current/app \
    lang='en_US.UTF-8' \
    locale='en_US.UTF-8' \
    connect-timeout=15 \
    socket-timeout=25 \
    request-timeout=25 \
    deadlock-timeout=120 \
    graceful-timeout=120 \
    restart-interval=0 \
    maximum-requests=200 \
    queue-timeout=15 \
    listen-backlog=10

WSGIProcessGroup wsgi
WSGIApplicationGroup %{GLOBAL}
```
Any ideas what might do the trick?
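To make the desired semantics concrete: what I want Apache/mod_wsgi to enforce at the server level is roughly what this app-level sketch does (the limit and timeout values are placeholders, and `ConcurrencyGate` is purely an illustration, not code I'm running):

```python
import threading

class ConcurrencyGate:
    """Admit at most `limit` concurrent requests. Callers over the limit
    wait up to `queue_timeout` seconds for a slot; if none frees up,
    the request is dropped so the caller can retry later."""

    def __init__(self, limit, queue_timeout):
        self._slots = threading.BoundedSemaphore(limit)
        self._queue_timeout = queue_timeout

    def run(self, fn):
        # Block in the "backlog" for at most queue_timeout seconds.
        if not self._slots.acquire(timeout=self._queue_timeout):
            return None  # over capacity and the wait expired: drop the request
        try:
            return fn()  # a slot was free: process normally
        finally:
            self._slots.release()
```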