Hi,
We recently moved to Django 2.x and Channels 2.2 from a system that had been running relatively stably on Channels 1.x for almost 3 years. Our tests did not show any issues when we ran Django Channels on our test servers. Even when we load tested with frequent requests, the errors were the ones we expected, i.e. some dropped connections and a few 502s, and as soon as the load was cut off things worked fine again.
This changed drastically when we deployed it in our prod environment. Our Daphne process gets SIGKILLed every few minutes, and as a result our websocket connections are extremely unstable and require frequent reconnects. Our setup is an AWS ALB sitting in front of NGINX, which directs all non-websocket traffic to a gunicorn process (which is pretty stable) and all websocket traffic to a Daphne process.
We also tried Uvicorn along the way. Uvicorn did not get SIGKILLed, but it was just as unresponsive.
Our ELB logs show a steady stream of 502 errors.
We have been dealing with these issues for around 2 days now and are completely lost. If you could give us some advice on where we should be looking and what could be causing the SIGKILLs, that would be seriously awesome.