Huge execution times. DeadlineExceededException(s). Crushing Weltschmerz.

92 views
Skip to first unread message

Cesium

unread,
Jun 1, 2012, 11:50:08 AM6/1/12
to google-a...@googlegroups.com
I am seeing wild execution times for requests that nominally take 500ms. E.g.,
xxx.xxx.xxx.xxx - - [01/Jun/2012:07:21:08 -0700] "POST /xxx/sdtp?t=ACCESS_POINT_APP_DATA_UPLOAD HTTP/1.1" 200 64 - - "xxx.appspot.com" ms=54346 cpu_ms=3858 api_cpu_ms=8 cpm_usd=0.107224 loading_request=1 pending_ms=1566 instance=00c61b117cfadbd3e42218864e6cc2ffe3449b

I'm also seeing DeadlineExceededException(s). E.g.,
com.google.apphosting.api.DeadlineExceededException: This request (41c5eb91df9afc7a) started at 2012/06/01 14:27:26.021 UTC and was still executing at 2012/06/01 14:28:25.488 UTC.
(that's a whole minute.)

The absence of any feedback from previous posts (see "Stop tearing down my instances!") has lead to overwhelming weltschmerz.

What's a brother gotta' do to get some love?
David


Takashi Matsuo

unread,
Jun 1, 2012, 12:11:29 PM6/1/12
to google-a...@googlegroups.com
Please file a production issue on our issue tracker:
http://code.google.com/p/googleappengine/issues/entry?template=Production%20issue
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/google-appengine/-/4_H4c0or0s8J.
> To post to this group, send email to google-a...@googlegroups.com.
> To unsubscribe from this group, send email to
> google-appengi...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.



--
Takashi Matsuo | Developer Advocate | tma...@google.com

Cesium

unread,
Jun 1, 2012, 3:59:16 PM6/1/12
to google-a...@googlegroups.com
Hi Takashi,

Thank you for your help!

I sent you an email with some details.

Cheers,
David

Takashi Matsuo

unread,
Jun 1, 2012, 6:51:20 PM6/1/12
to google-a...@googlegroups.com
Hi David,

First of all, it seems OK right now. Let me know that's not the case.

If you don't want to put information on the public issue tracker,
please use the "Report Production Issue" located near the right-top
corner of the Admin Console. The report tool will collect some useful
information and convey it to us secretly.

At the same time, I'd suggest using appstats in order to find out the
root cause of your latency first. If a certain RPC is the culprit,
there would be a decent workaround. Basically you can set a particular
deadline on the RPC call, and catch the deadline error and tell the
client to retry.

Besides that, please keep in mind that the performance of each loading
request(sure, actually every single request) may vary, because of
various reasons like actions of other customers in the same cluster,
the load on the system, or some kind of maintenance happening. App
Engine is a multi-tenant cloud platform.

Actually, in that timeframe, I can see that only loading requests of
your application had higher latencies.

Those kind of hiccups are supposed to be automatically addressed and
fixed swiftly(almost always faster than manual intervention like me
doing right now). Additionally, you can set "Min Idle Instances" to a
certain level in order to mitigate this kind of risk, although it will
certainly cost you some more money.

Please let me know if you have further questions.

-- Takashi
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/google-appengine/-/hD6FC_YSBlsJ.

Cesium

unread,
Jun 1, 2012, 7:07:40 PM6/1/12
to google-a...@googlegroups.com
Hi Takashi,

Thank you for your reply!

You have taken the time to personally look into my issues and for that I am grateful.

I promise to follow your suggestions. It may take me a couple of days to perform some tests.

I will let you know if I have new questions.

Best Regards,
David

Cesium

unread,
Jun 3, 2012, 9:08:28 AM6/3/12
to google-a...@googlegroups.com
Hi Takashi,

You wrote:

 Actually, in that timeframe, I can see that only loading requests of 
your application had higher latencies. 
Those kind of hiccups are supposed to be automatically addressed and 
fixed swiftly(almost always faster than manual intervention like me 
doing right now). Additionally, you can set "Min Idle Instances" to a 
certain level in order to mitigate this kind of risk, although it will 
certainly cost you some more money. 

Yes, the loading requests have become very disruptive to my application.

I am currently at Auto-Auto for the Idle Instances and Pending Latency settings.
There is 1 client connecting to my application, and it provides a light load. 
I cannot see from my logs when an instance shuts down, but I can see when one is started.

For example:


    1. 2012-06-03 06:25:12.259
       /xxx/sdtp?t=ACCESS_POINT_APP_DATA_UPLOAD 200 10716ms 0kb
      xxx.xxx.xxx.xxx - - [03/Jun/2012:05:25:12 -0700] "POST /xxx/sdtp?t=ACCESS_POINT_APP_DATA_UPLOAD HTTP/1.1" 200 64 - - "xxx.appspot.com" ms=10716 cpu_ms=5044 api_cpu_ms=708 cpm_usd=0.140171 loading_request=1 instance=00c61b117c977896b0d6d1161670c6b7518c43
    2. I2012-06-03 06:25:12.253
      This request caused a new process to be started for your application, and thus caused your application code to be loaded for the first time. This request may thus take longer and use more CPU than a typical request for your application.
    1. 2012-06-03 06:22:19.694 /xxx/sdtp?t=AP_RTC_SYNC 200 34ms 0kb
      xxx.xxx.xxx.xxx - - [03/Jun/2012:05:22:19 -0700] "POST /xxx/sdtp?t=AP_RTC_SYNC HTTP/1.1" 200 72 - - "xxx.appspot.com" ms=35 cpu_ms=38 api_cpu_ms=0 cpm_usd=0.001107 instance=00c61b117c3d078ba2eaebba354307e44c0a15

These two, back-to-back entries show something odd.
The first request at 6:22:19 is handled by an existing instance, namely ...0a15.

The next request comes in 3 seconds later. However, it is not handled by instance ...0a15.
A new process is started, namely ...8c43.

You are correct, when you say:

Actually, in that timeframe, I can see that only loading requests of 
your application had higher latencies.  

My question is: "Why was instance ...8c43 created to serve the request, rather than using ...0a15?"

Cheers,
David

alex

unread,
Jun 3, 2012, 10:45:17 AM6/3/12
to google-a...@googlegroups.com
Could it be that your average response time became much higher than the spinup time? That might explain the new instance.

Also, have you tried running it with multithreading enabled?

Cesium

unread,
Jun 3, 2012, 12:21:11 PM6/3/12
to google-a...@googlegroups.com
Hi Alex,

First things first.

I'm an idiot.

The second reply comes in 3 MINUTES later, not 3 SECONDS.

Let me chew on your suggestions a bit...

David
PS (Can I edit a previous post, rather than deleting it?)

Reply all
Reply to author
Forward
0 new messages