I work on an app that runs Django Channels 1 in production with CPU-based auto-scaling. Most of the time it works fine, but sometimes, when the site is busy and many users are in the channels-intensive parts of our application, users are unable to use their websocket connections. We need better metrics to help us determine the root cause.
- How do you monitor Django Channels in production?
- What are your favorite strategies for integrating with DataDog or New Relic?
- I'm considering writing some JS that detects a degraded websocket connection and posts a DataDog custom event (https://docs.datadoghq.com/api/v1/events/). Is there a better way to accomplish the same thing?
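To make the last idea concrete, here is a rough sketch of the client-side detection I have in mind, written in TypeScript. It assumes the page sends periodic heartbeat pings over the websocket and the server echoes them back, which our app does not do today. The names `HeartbeatSample`, `isDegraded`, and `buildDegradedEvent` and the thresholds are all made up for illustration. Only the payload field names (`title`, `text`, `alert_type`, `tags`) come from the DataDog events API.

```typescript
// Hypothetical sketch: none of these names exist in our codebase yet.

interface HeartbeatSample {
  sentAt: number;            // ms timestamp when the ping was sent
  receivedAt: number | null; // ms timestamp of the echoed reply, or null if none arrived
}

interface DegradedOptions {
  maxLatencyMs: number;  // replies slower than this count as bad
  maxBadSamples: number; // number of bad samples tolerated in the window
}

// A connection counts as degraded when more than maxBadSamples heartbeats
// in the window were either unanswered or answered too slowly.
function isDegraded(samples: HeartbeatSample[], opts: DegradedOptions): boolean {
  const bad = samples.filter(
    (s) => s.receivedAt === null || s.receivedAt - s.sentAt > opts.maxLatencyMs
  ).length;
  return bad > opts.maxBadSamples;
}

// Shape of the fields I would send to the DataDog events endpoint.
interface DatadogEventPayload {
  title: string;
  text: string;
  alert_type: "warning";
  tags: string[];
}

// Builds the event body; the tag values are placeholders.
function buildDegradedEvent(samples: HeartbeatSample[], page: string): DatadogEventPayload {
  const missed = samples.filter((s) => s.receivedAt === null).length;
  return {
    title: "Degraded websocket connection",
    text: `${missed} of ${samples.length} heartbeats got no reply on ${page}`,
    alert_type: "warning",
    tags: ["source:browser", "component:websocket"],
  };
}
```

The payload would presumably be posted to an endpoint on our own backend that forwards it to DataDog, rather than sent from the browser directly, so the API key stays off the client.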
Many thanks,
Myer