AppScale apps stopped working, Dashboard error "We're sorry, but something went wrong."

xybrek

Jun 1, 2016, 5:42:08 AM
to AppScale Community
A few hours ago our AppScale server was working fine, but it has since stopped working. Checking the 'app___appscaledashboard-8000.log' file, I can see a Datastore connection error.

- What is the root cause of this error?
- How can we prevent this error from happening again?
- What can we do to make AppScale more stable?

Here is the log:

ERROR    2016-06-01 09:26:01,520 datastore_distributed.py:358] Datastore connection error on get.
WARNING  2016-06-01 09:26:01,521 tasklets.py:409] suspended generator _get_tasklet(context.py:329) raised InternalError(Datastore connection error on get.)
WARNING  2016-06-01 09:26:01,521 tasklets.py:409] suspended generator get(context.py:744) raised InternalError(Datastore connection error on get.)
ERROR    2016-06-01 09:26:01,522 app_dashboard_data.py:580] Datastore connection error on get.
Traceback (most recent call last):
  File "/var/apps/appscaledashboard/app/lib/app_dashboard_data.py", line 551, in update_users
    user_info = self.get_by_id(UserInfo, email)
  File "/var/apps/appscaledashboard/app/lib/app_dashboard_data.py", line 188, in get_by_id
    return model.get_by_id(key_name)
  File "/root/appscale/AppServer/google/appengine/ext/ndb/utils.py", line 142, in positional_wrapper
    return wrapped(*args, **kwds)
  File "/root/appscale/AppServer/google/appengine/ext/ndb/model.py", line 3530, in _get_by_id
    return cls._get_by_id_async(id, parent=parent, **ctx_options).get_result()
  File "/root/appscale/AppServer/google/appengine/ext/ndb/tasklets.py", line 325, in get_result
    self.check_success()
  File "/root/appscale/AppServer/google/appengine/ext/ndb/tasklets.py", line 368, in _help_tasklet_along
    value = gen.throw(exc.__class__, exc, tb)
  File "/root/appscale/AppServer/google/appengine/ext/ndb/context.py", line 744, in get
    entity = yield self._get_batcher.add_once(key, options)
  File "/root/appscale/AppServer/google/appengine/ext/ndb/tasklets.py", line 368, in _help_tasklet_along
    value = gen.throw(exc.__class__, exc, tb)
  File "/root/appscale/AppServer/google/appengine/ext/ndb/context.py", line 329, in _get_tasklet
    entities = yield self._conn.async_get(options, datastore_keys)
  File "/root/appscale/AppServer/google/appengine/ext/ndb/tasklets.py", line 454, in _on_rpc_completion
    result = rpc.get_result()
  File "/root/appscale/AppServer/google/appengine/api/apiproxy_stub_map.py", line 615, in get_result
    return self.__get_result_hook(self)
  File "/root/appscale/AppServer/google/appengine/datastore/datastore_rpc.py", line 1450, in __get_hook
    self.check_rpc_success(rpc)
  File "/root/appscale/AppServer/google/appengine/datastore/datastore_rpc.py", line 1224, in check_rpc_success
    raise _ToDatastoreError(err)
InternalError: Datastore connection error on get.
INFO     2016-06-01 09:26:01,525 server.py:580] default: "GET /status/refresh HTTP/1.0" 200 17

xybrek

Jun 1, 2016, 7:12:04 AM
to AppScale Community
I also checked the /root/.appscale/log-* file

And I can see some more errors:

stacktrace : Traceback (most recent call last):
  File "/usr/local/appscale-tools/bin/appscale", line 82, in <module>
    appscale.status()
  File "/usr/local/appscale-tools/bin/../lib/appscale.py", line 456, in status
    AppScaleTools.describe_instances(options)
  File "/usr/local/appscale-tools/bin/../lib/appscale_tools.py", line 182, in describe_instances
    for ip in login_acc.get_all_public_ips():
  File "/usr/local/appscale-tools/bin/../lib/appcontroller_client.py", line 175, in get_all_public_ips
    self.server.get_all_public_ips, self.secret)
  File "/usr/local/appscale-tools/bin/../lib/appcontroller_client.py", line 125, in run_with_timeout
    function, *args)
  File "/usr/local/appscale-tools/bin/../lib/appcontroller_client.py", line 125, in run_with_timeout
    function, *args)
  File "/usr/local/appscale-tools/bin/../lib/appcontroller_client.py", line 125, in run_with_timeout
    function, *args)
  File "/usr/local/appscale-tools/bin/../lib/appcontroller_client.py", line 125, in run_with_timeout
    function, *args)
  File "/usr/local/appscale-tools/bin/../lib/appcontroller_client.py", line 125, in run_with_timeout
    function, *args)
  File "/usr/local/appscale-tools/bin/../lib/appcontroller_client.py", line 127, in run_with_timeout
    raise exception
error: [Errno 111] Connection refused

exception : error

locale : en_US

tools_version : 2.8.0

platform : Linux-3.16.0-30-generic-x86_64-with-Ubuntu-14.04-trusty

message : [Errno 111] Connection refused

runtime : CPython
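
The "[Errno 111] Connection refused" above means nothing accepted the TCP connection the tools tried to open. A minimal sketch for checking that by hand, assuming only the Python standard library; `is_port_open` is a hypothetical helper, not part of appscale-tools, and the host and port you pass in depend on your deployment:

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) is accepted.

    Hypothetical helper for illustration; not part of appscale-tools.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Covers "Connection refused" as well as timeouts and
        # unreachable hosts.
        return False
    conn.close()
    return True
```

Call it with the head node's address and whichever port the AppController listens on in your deployment. A False result is consistent with the service being down or unreachable, but it does not say why.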

Meni Vaitsi

Jun 2, 2016, 11:55:04 PM
to appscale_community
Hi there,

The AppScale dashboard stores monitoring information about your deployment in the database.
It looks like there was an error while talking to the database.

In long-running deployments, the database (Cassandra) periodically starts a compaction to save space. During that time it is under enough stress to cause timeouts and delays in responding. Typically, if your deployment has 2 or more database machines, this should not be a problem.
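
If the errors are short-lived timeouts of this kind, application code can sometimes ride them out by retrying a read with a growing delay. The following is only a sketch under that assumption: `get_with_retry` and `TransientDatastoreError` are hypothetical names, with the latter standing in for whatever exception your datastore call raises (the `InternalError` in the log above, for example). It is not how the dashboard itself handles the error.

```python
import time


class TransientDatastoreError(Exception):
    """Hypothetical stand-in for a transient datastore error."""


def get_with_retry(fetch, attempts=4, base_delay=0.5, sleep=time.sleep,
                   retry_on=(TransientDatastoreError,)):
    """Call fetch() and retry with exponential backoff on transient errors.

    fetch    -- zero-argument callable that performs the datastore read
    attempts -- total number of tries before giving up
    sleep    -- injectable so the delay can be replaced in tests
    retry_on -- exception types treated as transient (an assumption;
                pass the exception your datastore call actually raises)
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except retry_on:
            if attempt == attempts - 1:
                # Out of retries: let the caller see the original error.
                raise
            # Wait 0.5s, 1s, 2s, ... between tries.
            sleep(base_delay * (2 ** attempt))
```

A call might look like `get_with_retry(lambda: UserInfo.get_by_id(email), retry_on=(InternalError,))`, where the exception type is again an assumption. Retrying only hides brief stalls; it does not help if the database stays unreachable.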

How many database nodes is your deployment using at the moment?

Do you see the same errors in your application log (/var/log/appscale/app___<app_ID>.log)?

-Meni

--
Meni Vaitsi
Software Engineer
AppScale Systems, Inc.
