Thanks Tyler. I appreciate the follow-up.
I don't have any queries in my app - it's always pulling keys directly.
It was queued-up writes once the outage was resolved. Probably because of the massive load from everyone trying to write at the same time, each op reported very high latency.
From my perspective, this metric isn't very helpful as is. Unless I notice a spike within 24 hours, there's no way to dig in and try to correlate it with the cause of a performance issue. I'd love to see this metric at a more granular level, with support for alerts and overlays for outages / annotations. This would allow me to correlate performance issues with deployments, marketing campaigns, outages, etc. and take real action to resolve them.
Keep up the awesome work! I've seen massive improvements over the last year. I'm very excited for what's next!
Drew