We've had our production application running problem free for 10 months, but in the past month or so we've started encountering "MySQL server has gone away" errors sporadically, but with increasing frequency. Currently it is happening every few days, sometimes several times a day. This has never happened to the app in test or development environments.
These factors lead me to believe this problem is manifesting itself as a result of increased activity. Still, our loads are relatively low. We may have a handful of users connected at any one time, but not more than 10.
Looking at our Newrelic database reports, we're seeing that the a huge spike in db clock time always coincides with the errors, and the culprit is always the web2py_session_ UPDATE call. (It may possibly correlated with new user signups, but we haven't been able to confirm this)
Stranger still, in one of these incidents there were over a hundred calls made to web2py_session_{app} UPDATE, many times the number of total requests made to the app over the same period.
Traces:
Traceback (most recent call last):
File "/opt/web-apps/web2py/gluon/main.py", line 624, in wsgibase
BaseAdapter.close_all_instances('rollback')
File "/opt/web-apps/web2py/gluon/dal.py", line 529, in close_all_instances
db._adapter.close(action)
File "/opt/web-apps/web2py/gluon/dal.py", line 509, in close
getattr(self, action)()
File "/opt/web-apps/web2py/gluon/dal.py", line 1655, in rollback
if self.connection: return self.connection.rollback()
File "/usr/lib/python2.6/site-packages/newrelic-1.12.0.9/newrelic/hooks/database_dbapi2.py", line 76, in rollback
return self._nr_connection.rollback()
OperationalError: (2006, 'MySQL server has gone away')
and
Traceback (most recent call last):
File "/opt/web-apps/web2py/gluon/main.py", line 551, in wsgibase
session._try_store_in_db(request, response)
File "/opt/web-apps/web2py/gluon/globals.py", line 726, in _try_store_in_db
table._db(table.id == record_id).update(**dd)
File "/opt/web-apps/web2py/gluon/dal.py", line 8812, in update
ret = db._adapter.update(tablename,self.query,fields)
File "/opt/web-apps/web2py/gluon/dal.py", line 1372, in update
self.execute(sql)
File "/opt/web-apps/web2py/gluon/dal.py", line 1694, in execute
return self.log_execute(*a, **b)
File "/opt/web-apps/web2py/gluon/dal.py", line 1688, in log_execute
ret = self.cursor.execute(*a, **b)
File "/usr/lib/python2.6/site-packages/newrelic-1.12.0.9/newrelic/hooks/database_dbapi2.py", line 36, in execute
return self._nr_cursor.execute(sql, *args, **kwargs)
File "/usr/lib64/python2.6/site-packages/MySQLdb/cursors.py", line 173, in execute
self.errorhandler(self, exc, value)
File "/usr/lib64/python2.6/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
raise errorclass, errorvalue
OperationalError: (1205, 'Lock wait timeout exceeded; try restarting transaction')
What could be the cause of all this? Any ideas welcome.