disruption for a couple of hours, have no idea what was the problem


Skirmantas Jurgaitis

Jan 29, 2012, 5:09:54 PM
to google-a...@googlegroups.com

Today my app had serious problems for a couple of hours. I am trying to understand what went wrong and how to prevent this in the future. I think this was an App Engine problem: the memcache service was not responding for a while. I conclude this because in the logs I find a large number of requests that died like this:

<...> 
File "/base/python_runtime/python_lib/versions/1/google/appengine/api/memcache/__init__.py", line 619, in __get_hook
rpc.check_success()
<...>
File "/base/python_runtime/python_lib/versions/1/google/appengine/api/apiproxy_rpc.py", line 119, in Wait
rpc_completed = self._WaitImpl()
File "/base/python_runtime/python_lib/versions/1/google/appengine/runtime/apiproxy.py", line 131, in _WaitImpl
rpc_completed = _apphosting_runtime___python__apiproxy.Wait(self)
DeadlineExceededError

However, http://code.google.com/status/appengine does not show any problems, so I am not sure.

Here is today's graph of milliseconds/request:


If this was an App Engine problem, how can I prove that, and how can I get a refund for the resources wasted by the many instances that were started during those hours?

Brandon Wirtz

Jan 30, 2012, 4:08:18 AM
to google-a...@googlegroups.com

You on HR or MS? 

 

Brandon Wirtz
BlackWaterOps: President / Lead Mercenary




Skirmantas Jurgaitis

Jan 31, 2012, 9:04:23 AM
to google-a...@googlegroups.com
> You on HR or MS?

High replication.

Robert Kluin

Feb 1, 2012, 1:38:54 AM
to google-a...@googlegroups.com
Hi,
I don't remember whether memcache is covered under the SLA; I don't think
it was going to be originally.

According to the SLA docs
(http://code.google.com/appengine/sla.html), you can submit a request
using this form:
https://docs.google.com/spreadsheet/formResponse?hl=en_US&formkey=dFk2SEVTd0lrS1d1M2I0S3Q5eFlWQnc6MQ

To help reduce the impact of this type of event in the future, you
could do two things:
1) Set a deadline on the memcache RPC call.
http://code.google.com/appengine/docs/python/memcache/clientclass.html#Client_create_rpc
2) Catch and appropriately handle deadline errors around your
memcache calls.
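
A minimal sketch of how the two suggestions might fit together, assuming the cache lookup is wrapped in a small helper. The names `get_or_compute`, `fetch_cached`, `compute`, `store`, and `CacheTimeout` are hypothetical and not part of the App Engine SDK. The commented wiring at the bottom is an assumption based on the linked `create_rpc` docs, and the 0.5 second deadline is an arbitrary value; check both against your SDK version.

```python
class CacheTimeout(Exception):
    """Stand-in for the platform's deadline error (hypothetical name)."""


def get_or_compute(key, fetch_cached, compute, store=None,
                   deadline_errors=(CacheTimeout,)):
    """Return the cached value for key.

    Falls back to compute(key) when the cache misses or when the cache
    call raises one of deadline_errors.
    """
    try:
        value = fetch_cached(key)
    except deadline_errors:
        # The cache is slow or unavailable: treat it as a miss and skip
        # the write-back so the request does not wait on the cache twice.
        return compute(key)
    if value is None:
        value = compute(key)
        if store is not None:
            try:
                store(key, value)
            except deadline_errors:
                pass  # Failing to populate the cache is not fatal.
    return value


# Possible App Engine wiring (an assumption, not executed here):
#
#   from google.appengine.api import memcache
#   from google.appengine.runtime import apiproxy_errors
#
#   def fetch_cached(key):
#       rpc = memcache.create_rpc(deadline=0.5)  # seconds, arbitrary
#       client = memcache.Client()
#       return client.get_multi_async([key], rpc=rpc).get_result().get(key)
#
#   value = get_or_compute(
#       "some-key", fetch_cached, load_from_datastore,
#       deadline_errors=(apiproxy_errors.DeadlineExceededError,))
```

The point of the design is that a slow memcache degrades into a cache miss instead of holding the request open until the overall request deadline kills it.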


Robert

