Unusually high memcache latencies

Rishi Arora

May 22, 2012, 11:59:49 AM
to google-a...@googlegroups.com
My app usually consumes ~30 instance hours every day, and occasionally spikes to ~35 when Master-Slave datastore latencies go up.  We will be transitioning to HRD soon, but today large latency spikes have caused my instance hours to reach ~50 already, in the first 9 hours of the day.  I peeked at the appstats output to see where this spike is coming from, and it appears to be memcache (see attached pic).  This seems absurd and not something app operators should be expected to pay for.  Anybody else seen this?

Thanks in advance
- Rishi

[Attachment: Screen Shot 2012-05-22 at 10.50.11 AM.png]

Rishi Arora

May 22, 2012, 12:01:25 PM
to google-a...@googlegroups.com
Also, I imagine going to HRD would indirectly help my situation because I'll be forced to upgrade to Python 2.7, which can serve concurrent requests on a single instance.  So, while my instance is waiting forever for memcache to return some data, it would be available to service other requests, and keep the number of active instances low.  Am I correct in this reasoning?

Joshua Smith

May 22, 2012, 12:24:02 PM
to google-a...@googlegroups.com
HRD does not require Py2.7.

Rishi Arora

May 22, 2012, 12:41:53 PM
to google-a...@googlegroups.com
Ah, I was mistaken.  The restriction is the other way round.  Py2.7 requires HRD.

Gayle Laakmann

May 22, 2012, 3:04:31 PM
to Google App Engine
I think I'm having the same issue.

How did you get that look at memcache latency specifically?

Waleed Abdulla

May 22, 2012, 4:27:59 PM
to google-a...@googlegroups.com
I'm noticing excessive datastore delays today (M/S), and generally a lot of API calls timing out. Might be related. Errors like:

DeadlineExceededError: The API call logservice.Flush() took too long to respond and was cancelled.

Also:

<class 'google.appengine.runtime.apiproxy_errors.DeadlineExceededError'>: The API call urlfetch.Fetch() took too long to respond and was cancelled.
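For intermittent timeouts like these, one common mitigation is to retry the API call a few times before giving up. The following is only a minimal sketch under stated assumptions: `call_with_retry` and its parameters are hypothetical names, not part of the App Engine SDK, and the exception class to retry on is passed in by the caller. On App Engine that would presumably be the `DeadlineExceededError` from `google.appengine.runtime.apiproxy_errors` shown in the error above.

```python
import time


def call_with_retry(api_call, retriable, attempts=3, backoff_seconds=0.0):
    """Call api_call(), retrying when it raises the `retriable` exception.

    api_call        -- zero-argument callable wrapping the real API call
    retriable       -- exception class (or tuple of classes) worth retrying
    attempts        -- total number of tries, must be at least 1
    backoff_seconds -- base delay, doubled after each failed try (0 = no delay)
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return api_call()
        except retriable as err:
            last_error = err
            # Sleep only if another try is coming and a delay was requested.
            if backoff_seconds and attempt < attempts - 1:
                time.sleep(backoff_seconds * (2 ** attempt))
    # Every try failed: surface the last error to the caller.
    raise last_error
```

Usage would look something like `call_with_retry(lambda: urlfetch.fetch(url), apiproxy_errors.DeadlineExceededError)`. Retrying hides short blips but adds to request time, so it does not help when the service stays slow for long stretches.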

Rishi Arora

May 22, 2012, 4:56:05 PM
to google-a...@googlegroups.com
I used a tool called Appstats.  I don't know if there's a Java version of the tool, but here's the Python version:
https://developers.google.com/appengine/docs/python/tools/appstats

Installation requires a slight modification to the app.yaml and appengine_config.py files.  The only other modification you might have to make is to reduce your memory footprint.  Enabling appstats caused a small percentage of my POST requests to exceed the soft memory limit.  I worked around that by selectively disabling appstats for those requests.
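A minimal sketch of what that setup might look like, assuming the Python runtime. In app.yaml, Appstats is switched on with an `appstats: on` entry under `builtins:`. The appengine_config.py side is shown below. The `APPSTATS_SKIP_PREFIXES` tuple and the `/upload` path are hypothetical placeholders for whichever requests run close to the memory limit, and the exact hooks should be checked against the Appstats documentation linked above.

```python
# appengine_config.py (sketch)

# Hypothetical: URL path prefixes of memory-heavy POST handlers that
# should not be recorded by Appstats.
APPSTATS_SKIP_PREFIXES = ('/upload',)


def webapp_add_wsgi_middleware(app):
    """Wrap the WSGI app so Appstats records its API calls."""
    # Imported here so this module still loads where the SDK is absent.
    from google.appengine.ext.appstats import recording
    return recording.appstats_wsgi_middleware(app)


def appstats_should_record(env):
    """Return False for requests Appstats should skip.

    `env` is the WSGI environ dict for the current request.
    """
    is_post = env.get('REQUEST_METHOD') == 'POST'
    path = env.get('PATH_INFO', '')
    if is_post and path.startswith(APPSTATS_SKIP_PREFIXES):
        return False
    return True
```

With this in place, a POST to `/upload/...` is served without recording, while everything else still shows up in the Appstats console.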

Hope this helps.

Rishi Arora

May 22, 2012, 4:59:17 PM
to google-a...@googlegroups.com
Perhaps they're related, but Google generally doesn't do anything about datastore deadline exceeded errors.  They claim that you should either be using HRD or make your app tolerant of M/S datastore spikes by making asynchronous calls.  In my case, I studied Appstats output thoroughly over a large sample, and the problem is specific to memcache API calls.

FYI, here's a production issue I just logged, which you should consider starring, so that it gets attention:

code.google.com/p/googleappengine/issues/detail?id=7554

Waleed Abdulla

May 22, 2012, 5:27:26 PM
to google-a...@googlegroups.com
I meant that the extra delays might not be just memcache, but an API delay in general, because I'm noticing it with the logging service and the urlfetch service as well, as shown by the errors I listed. It's really hard to pinpoint these things because different apps have different usage patterns, so they might notice the problem in different ways.

Rishi Arora

unread,
May 22, 2012, 5:31:38 PM5/22/12
to google-a...@googlegroups.com
Have you tried installing appstats?

Rishi Arora

May 22, 2012, 5:53:39 PM
to google-a...@googlegroups.com
Doh!  Google marked this issue as "WontFix", blaming it, as always, on the M/S datastore.  Here's my comment on this issue that will hopefully reach someone who cares:

"You've GOT to be kidding me!!  Did you so much as glance at any logs to arrive at this conclusion?  Or did you just read the letters "M/S" and decided it must be the datastore?  Explain to us how M/S datastore latency also causes memcache latency. You guys seem to use M/S -> HRD migration way too leniently as a red herring explaining any issue that you can't explain otherwise.  This makes no sense at all.  Why is the M/S data-store even offered as an alternative if it causes so many problems completely unrelated to the datastore itself.  Sounds like a cheap trick to force people to move to HRD to satisfy some hidden agenda. I'm definitely sold on the HRD and its value to me as a developer.  But attributing such a vast number of issues to M/S without any proper investigation is unfair."

Filip Svendsen

May 23, 2012, 5:09:07 AM
to Google App Engine
I'm seeing the same issue, also beginning on May 22.

We did an emergency upgrade to the HR datastore this morning in hopes
that this might fix our problem. Unsurprisingly, the memcache latency
problem persists.

BTW, not all memcache requests are slow. Sometimes
everything will go back to being fine for up to an hour.
