
We fixed MDN performance/downtime


Luke Crouch

Apr 18, 2014, 12:44:48 PM
to dev...@lists.mozilla.org, dev-mdc
We had intermittent slow-downs and down-time on MDN over the last 12-24
hours. [1]

We checked through our latest code changes for any obvious offenders and
found nothing. [2]

We looked at the most recent Chief deployment log and found that 1 of
our 3 web servers didn't gracefully restart Apache. [3]

So, we had a web-head with 78 zombie processes serving 1/3 of MDN
requests. :(

solarce from WebOps stopped Apache on the box, killed the zombie
processes, and started Apache back up.
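
For anyone who wants to check a web-head for the same symptom, here is a
minimal sketch of counting zombie Apache processes. It is not the command
solarce ran; the function names and the "httpd" process name are
assumptions for illustration (Debian-style boxes would use "apache2").

```python
import subprocess


def count_zombies(ps_output: str, command: str = "httpd") -> int:
    """Count processes in state Z whose command name starts with `command`.

    Expects one process per line in the form "<pid> <stat> <comm>".
    The default "httpd" is an assumption; adjust it for your distribution.
    """
    count = 0
    for line in ps_output.splitlines():
        # Split into at most 3 fields so "httpd <defunct>" stays together.
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        _pid, state, comm = fields
        # A state beginning with "Z" marks a zombie (defunct) process.
        if state.startswith("Z") and comm.startswith(command):
            count += 1
    return count


def read_process_table() -> str:
    """Return pid, state and command name for every process, no headers."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "stat=", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(count_zombies(read_process_table()))
```

Running it on each web server right after a deploy would flag a box whose
count is not zero.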

New Relic has stopped sending alerts, and is reporting green on Apdex
and server response times again.

Thanks davidwalsh, openjck, solarce, and cyliang for the help!

-L

[1] https://rpm.newrelic.com/accounts/263620/incidents/8540135
[2] https://github.com/mozilla/kuma/compare/e9f335d2e688d83aa532bbdb69c43cae4826f043...3791b61bdec350387774a2c53b6ff43a0179fdba
[3] https://pastebin.mozilla.org/4848096

--
Q: Why is this email five sentences or less?
A: http://five.sentenc.es
