Any idea what causes this warning "Stats data was too big, all stack traces were removed"

Damith C Rajapakse

Aug 21, 2016, 9:32:55 PM
to Google App Engine
hi,
Our Java app is suddenly getting a lot of this warning. The requests complete successfully, though. Any idea what kind of situation causes it, or how to avoid it?

com.google.appengine.tools.appstats.MemcacheWriter logMaybe: Stats data was too big, all stack traces were removed

Nick (Cloud Platform Support)

Aug 22, 2016, 3:45:14 PM
to Google App Engine
Hey Damith,

This appears to have been brought up and answered in a prior thread, which I found by googling the error. Let me know if you have any further questions!

Cheers,

Nick
Cloud Platform Community Support

Damith C Rajapakse

Aug 22, 2016, 9:32:29 PM
to Google App Engine
Thanks for the reply Nick. Much appreciated.
I was wondering what made the warning appear all of a sudden (we did not see this warning for the last few years). If we can correct whatever is causing the AppStats data to be bigger than 1MB, then we can continue to use AppStats.

Nick (Cloud Platform Support)

Aug 23, 2016, 3:57:53 PM
to Google App Engine
Hey Damith,

The error message seems to suggest that a stack trace, or a collection of stack traces, was over 1MB in total. Maybe a long stack trace is getting recorded repeatedly - do you have any retry logic using try-catch blocks?
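If trimming the traces at the source is difficult, you can also tell AppStats to record less per trace. Here is a minimal sketch of the filter configuration in web.xml, assuming the maxLinesOfStackTrace init-param described in the AppStats for Java documentation (the value 20 is an arbitrary example; please verify the parameter name against your SDK version):

<!-- Cap how many lines of stack trace AppStats records per RPC.
     maxLinesOfStackTrace is per the AppStats for Java docs;
     20 is an arbitrary example value. -->
<filter>
    <filter-name>appstats</filter-name>
    <filter-class>com.google.appengine.tools.appstats.AppstatsFilter</filter-class>
    <init-param>
        <param-name>maxLinesOfStackTrace</param-name>
        <param-value>20</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>appstats</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>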


Cheers,

Nick
Cloud Platform Community Support

Damith C. Rajapakse

Aug 25, 2016, 8:17:10 PM
to google-a...@googlegroups.com
Thanks again for the help Nick. Yes, we are looking into reducing the stack traces. Still, this warning should not have appeared suddenly after so many years in operation.
On a related note, this and a few other errors (e.g. 104 errors) arrived in a wave at the same time (possibly after the recent GAE update), which makes it feel as if some dials on the GAE side were tweaked, changing the constraints applied to apps. For example, errors we were previously able to handle by catching the DeadlineExceededError suddenly started producing 104 errors. As I've not seen many similar reports from other app devs, this situation probably affects only apps that handle high-latency, Datastore-heavy requests (like ours).

Nick (Cloud Platform Support)

Sep 8, 2016, 4:29:53 PM
to Google App Engine
Hey Damith,

I believe I can shed some light on this situation. As explained in the docs,

If the DeadlineExceededError is caught but a response is not produced quickly enough (you have less than a second), the request is aborted and a 500 internal server error is returned.

Another possible cause of 104 errors is:

In the Java runtime, if the DeadlineExceededException is not caught, an uncatchable HardDeadlineExceededError is thrown. The instance is terminated in both cases, but the HardDeadlineExceededError does not give any time margin to return a custom response. To make sure your request returns within the allowed time frame, you can use the ApiProxy.getCurrentEnvironment().getRemainingMillis() method to checkpoint your code and return if you have no time left. The Java runtime page contains an explanation of how to use this method. If concurrent requests are enabled through the "threadsafe" flag, every other running concurrent request is killed with error code 104.
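To make the getRemainingMillis() advice concrete, here is a minimal sketch of a checkpointed handler; the servlet name, the SAFETY_MARGIN_MS value, and the loadWork()/processBatch() helpers are hypothetical stand-ins for your own code:

import java.io.IOException;
import java.util.List;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import com.google.apphosting.api.ApiProxy;

public class CheckpointedServlet extends HttpServlet {
    // Hypothetical margin: stop working while there is still comfortably
    // more than a second left, so a response can be written in time.
    private static final long SAFETY_MARGIN_MS = 2000;

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        for (Work batch : loadWork()) {
            // Checkpoint before each unit of work; bail out early rather
            // than letting the hard deadline kill the instance.
            if (ApiProxy.getCurrentEnvironment().getRemainingMillis() < SAFETY_MARGIN_MS) {
                resp.setStatus(HttpServletResponse.SC_ACCEPTED);
                resp.getWriter().println("Partial result; ran out of time.");
                return;
            }
            processBatch(batch); // the Datastore-heavy work
        }
        resp.getWriter().println("Done.");
    }

    // Hypothetical stubs standing in for the app's real logic.
    private List<Work> loadWork() { return java.util.Collections.emptyList(); }
    private void processBatch(Work batch) {}
    private static class Work {}
}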

In line with the first quoted passage above, it's possible that the same Datastore latency that pushed the request toward its deadline is also slowing the AppStats write that records the DeadlineExceeded exception, so the response is not produced "quickly enough" and the observed error results.

In general, code written to handle a DeadlineExceeded error, while a good thing, can mask how often the system runs very close to the deadline, so a slight change in Datastore latency can push a certain proportion of requests over the edge. And because the deadline itself is announced via an exception, which in turn triggers AppStats recording (possibly for long enough to hit the hard deadline), this can produce some moderately complex failure scenarios.

I believe this entire class of errors could be avoided by identifying whatever long-running activity is causing requests to run so close to the deadline, and shifting that activity to a Task Queue or another form of processing that doesn't take place directly within the App Engine request handler (see the sketch below). Another option would be to switch to basic scaling, which does not have a 60-second deadline.
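For the Task Queue option, a minimal sketch of enqueueing the slow work from the user-facing handler; the "/worker" URL and the "reportId" parameter are hypothetical, and the worker servlet behind them would do the Datastore-heavy processing under the much longer task deadline:

import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

public class ReportEnqueuer {
    // Enqueue the slow Datastore work instead of running it in the
    // user-facing handler. "/worker" and "reportId" are hypothetical;
    // a push-queue task gets far more time than the 60-second
    // frontend request deadline.
    public static void enqueueReport(String reportId) {
        Queue queue = QueueFactory.getDefaultQueue();
        queue.add(TaskOptions.Builder
            .withUrl("/worker")
            .param("reportId", reportId));
    }
}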


Cheers,

Nick
Cloud Platform Community Support

Damith C. Rajapakse

Sep 12, 2016, 1:20:00 PM
to google-a...@googlegroups.com
Thanks Nick. Yes, that's our thinking too. For the moment we've removed all the 'error handling' code that used to run after the DeadlineExceededException, and also removed AppStats. That seems to work. I wish App Engine gave a slightly longer window to clean up, e.g. 5 seconds.
Thanks again for taking time to give a helpful reply. Much appreciated.
