Understanding "Request was aborted after waiting too long"


ckhan

Nov 7, 2012, 11:09:48 PM
to google-a...@googlegroups.com
Several requests over the last two days have failed with:

0.1.0.2 - - [06/Nov/2012:10:26:04 -0800] "POST /_ah/queue/deferred HTTP/1.1" 500 0 "http://myapp.appspot.com/my/url" "AppEngine-Google; (+http://code.google.com/appengine)"
1:1352226364.645785 Request was aborted after waiting too long to attempt to service your request.

I know this problem is reported and discussed often here - but I'm still unclear on precisely how to interpret it.

I'm trying to make sure I understand the relationship between idle instances,
pending latency, dynamic/resident instances, warmup requests and startup time.
My configuration:

   Idle instances : 4 - Automatic
   Pending latency: Automatic - Automatic

My app makes heavy use of deferred; at the time of the failure several
dozen tasks had been posted to the app from the taskqueue.

1. I'm going to start by assuming that while this request came to my app
   from the taskqueue (via the deferred library), this problem has nothing
   to do with the taskqueue per se.

2. The 500 means that this request was waiting in the Pending Queue:


      App Engine's scheduler is responsible for routing incoming
      requests to be served by your app's instances. Sometimes the
      volume of incoming requests exceeds the capacity of the
      instances currently available to your app. When this happens,
      incoming requests may have to wait in the Pending Queue until
      busy instances become available, or until the scheduler starts
      new instances.

3. So by that definition, there were only 3 ways out of the queue.
   After minimum pending latency, but before max, Scheduler does one of these:

   1. One of the 4 resident instances becomes idle, and gets the request.

   2. One of the dynamic instances becomes idle, and gets the request.

   3. Scheduler spins up a new dynamic instance.
      
      3a. If the instance comes up in time, the request is sent there.
          
          As the 'inaugural request' to this instance, this request
          is known as a "loading request". 

          Your app handles the request, but it's noticeably slower.
          You get the warning in the log:

          "This request caused a new process to be started for your
          application, and thus caused your application code to be
          loaded for the first time. This request may thus take longer
          and use more CPU than a typical request for your
          application."
          
      3b. If the instance does not come up in time, the request is
          aborted in the Pending Queue before the app ever sees it.

          You get the error in the log:

          "Request was aborted after waiting too long to attempt to
          service your request."
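
The three exits listed above can be written down as a toy model. This is only a sketch of the mental model in this post, not App Engine's actual scheduler logic; the function name, its parameters and the `max_wait_s` cap are all hypothetical, since the real limit is not stated anywhere in this thread.

```python
def pending_outcome(idle_wait_s, startup_s, max_wait_s):
    """Toy model of the exits from the Pending Queue described above.

    idle_wait_s: seconds until an existing (resident or dynamic) instance
                 becomes idle, or None if none frees up.
    startup_s:   seconds a new dynamic instance needs to start, or None
                 if the scheduler starts none.
    max_wait_s:  hypothetical cap on time spent in the queue.
    """
    candidates = []
    if idle_wait_s is not None:
        # Cases 1 and 2: an existing instance frees up.
        candidates.append((idle_wait_s, "served by idle instance"))
    if startup_s is not None:
        # Case 3a: a new instance starts and takes a loading request.
        candidates.append((startup_s, "loading request on new instance"))
    if candidates:
        wait, outcome = min(candidates)  # whichever happens first
        if wait <= max_wait_s:
            return outcome
    # Case 3b: nothing became available in time.
    return "aborted in Pending Queue"
```

Under this model, `pending_outcome(None, 15, 10)` is case 3b: the only way out is a new instance, and it starts too late.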

The big questions I have :

1. Is my summary above accurate? Are there any other cases
   where "request was aborted after waiting too long" happens?

2. How long can you sit in the pending queue before you hit case #3b,
   and the request is aborted? Do I have any control over this value?

3. I don't have warmup requests configured. Would this have helped?
   If so, why? The scheduler has *real* requests waiting in the
   pending queue, why/when would it need to send me warmup ones?

And most importantly:
How can I tell the difference between :

   my instance took too long to come up because my app isn't optimized properly (ie, my problem)

   AND

   my instance took too long to come up because of something internal to GAE, entirely outside of my control
   (ie, an issue that should be reported to GAE prod)
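
One way to gather evidence for the first case would be to time the app's own startup work, for example its module imports. This is a sketch only: `time_imports` is a hypothetical helper, and it measures nothing about the scheduler or the platform itself. Modules that are already loaded will report close to zero.

```python
import importlib
import time


def time_imports(module_names):
    """Return a dict of module name -> seconds spent importing it."""
    timings = {}
    for name in module_names:
        start = time.perf_counter()
        importlib.import_module(name)
        timings[name] = time.perf_counter() - start
    return timings
```

If the app's own imports and setup account for most of the time a loading request takes, the slow startup is probably on the app's side.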

Thanks so much for any comments/pointers/responses.
-ckhan

Tomas

Nov 8, 2012, 3:13:32 PM
to google-a...@googlegroups.com
+1

Strom

Nov 10, 2012, 4:28:08 AM
to google-a...@googlegroups.com
Great post, I would like to know the definitive answers as well.

For your big question #2, I believe a request has a total of 60 seconds to return a response, and the timer for those 60 seconds starts when the request comes in. So both the piping and the actual request handling have to occur within this time.
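
If that is right, the arithmetic looks like the sketch below. The 60-second default is only the belief stated above, not a confirmed value, and `handler_budget_s` is a hypothetical name.

```python
def handler_budget_s(seconds_in_queue, total_deadline_s=60.0):
    """Seconds left for the handler after time spent in the Pending Queue.

    total_deadline_s defaults to the 60 seconds assumed above.
    """
    return max(0.0, total_deadline_s - seconds_in_queue)
```

A request that waited 12.5 seconds in the queue would have 47.5 seconds left to run, and one that waited the full deadline would have nothing left.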

sophe...@gae.golgek.mobi

Sep 4, 2013, 7:02:30 AM
to google-a...@googlegroups.com
Hi,

This is a good and very real use case. I faced this issue as well, and I would also like to see an answer from a Google App Engine engineer on how to resolve it.

For my use case: the client would like to have 10K or more requests per second (concurrent access) to the Google App Engine server. I tried to test this with the JMeter tool and got a lot of errors.

I would like to know whether a backend instance can help or not.

Vinny P

Sep 5, 2013, 2:38:06 AM
to google-a...@googlegroups.com
On Wed, Sep 4, 2013 at 6:02 AM, <sophe...@gae.golgek.mobi> wrote:
For my use case: the client would like to have 10K or more requests per second (concurrent access) to the Google App Engine server. I tried to test this with the JMeter tool and got a lot of errors.



App Engine can scale up and spawn instances as needed to handle incoming requests, so I wouldn't worry too much about 10k concurrent requests. 

In regards to testing with the JMeter tool, what kind of errors were you getting? Were they server-side (App Engine) or client side (JMeter)? If there were server side errors, can you supply the exact text of the error or a screenshot of the logs screen? If the error text is the same as the OP's, try decreasing the Pending Latency slider in the App Engine console.
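
To check the server side, one option is to tally status codes per path from exported request logs. The sketch below assumes the log lines follow the same layout as the one quoted in the original post; the regular expression and the name `tally_statuses` are illustrative, not an App Engine API.

```python
import re
from collections import Counter

# Matches the request-log layout quoted in the original post:
# ip - - [timestamp] "METHOD path protocol" status size ...
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def tally_statuses(lines):
    """Count (path, status) pairs; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[(match.group("path"), int(match.group("status")))] += 1
    return counts
```

A large count of 500s on a single path, with no matching application log output, would point at requests that never reached the app.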

 
 
-----------------
-Vinny P
Technology & Media Advisor
Chicago, IL

App Engine Code Samples: http://www.learntogoogleit.com
 
