Even the HR Datastore is (slightly) having performance issues now.


Raymond C.

Sep 1, 2011, 11:30:00 PM
to google-a...@googlegroups.com

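http://code.google.com/status/appengine/detail/hr-datastore/2011/09/0...

Compare it to half a year before: http://code.google.com/status/appengine/detail/hr-datastore/2011/03/0...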

It seems to me that HRD was so stable because there were not many applications on it (just as, back in the day, M/S did not have so much downtime or so many performance issues).

Cameron Corda

Sep 2, 2011, 12:54:05 AM
to google-a...@googlegroups.com
Will datastore latency be covered by the SLA?  All I'm seeing online is an old draft SLA:

Kenneth

Sep 2, 2011, 2:44:33 AM
to google-a...@googlegroups.com
So you're saying stay on MS since it will become super stable when everyone has migrated to HR?  :-)


Bringing this back to billing, there's no SLA on the datastore, so spikes like that are certainly making Google money, which seems perverse.

Raymond C.

Sep 2, 2011, 4:10:30 AM
to google-a...@googlegroups.com
No, actually I migrated to HR earlier (it was an awful experience with 30+ hours of downtime; the trial run took only 8 hours with the same data).

I am just trying to give a signal to everyone before something bad hits us again.  Maybe the GAE team could do something before it happens.

Mike Wesner

Sep 2, 2011, 9:02:33 AM
to Google App Engine
I would be interested in what the GAE team has to say about this. It
seems exactly the same as the evolution of the master/slave
datastore. Its latency crept up over the first few years and it became
pretty unstable. Definitely concerning.

Mike

On Sep 1, 9:30 pm, "Raymond C." <windz...@gmail.com> wrote:
> http://code.google.com/status/appengine/detail/hr-datastore/2011/09/0...
>
> Compare it to half a year before: http://code.google.com/status/appengine/detail/hr-datastore/2011/03/0...

Tim Hoffman

Sep 2, 2011, 9:16:26 AM
to google-a...@googlegroups.com
Hi

Actually, I would disagree on the M/S front. All of my apps are on M/S, and even despite the recent issues
M/S is way more stable than it was in 2008 and 2009. I mean WAY MORE STABLE ;-)

That's not to say more growing pains won't be felt.  My guess is that, with a supposed imminent exodus of apps, we may see some
performance improvements across the board ;-)

Rgds

Tim

Mike Wesner

Sep 2, 2011, 9:23:11 AM
to Google App Engine
Yes, I agree that it won't (can't) get as bad as M/S did because
HRD is awesome. My concern is mostly over how they provision and that
the latency is that much higher than it was. Why does the load have
such a big impact on the latency of datastore queries?

We keep very detailed statistics about the latency of our services, and
it's clear from our average latencies that it has crept up over time.
It is roughly 200ms higher than it was when we first moved over to
HRD.

Mike

Pol

Sep 2, 2011, 11:15:18 AM
to Google App Engine
This is really scary. I'm not so obsessed about the price increase as
this is something you can "control", but if the HRD starts becoming
slower and slower, there's absolutely nothing you can do. According to
these official graphs, it's literally 2X slower than 1 year ago. I'm
OK with trading some speed for increased reliability, the promise
of HRD, but at some point, you need to draw a line: what tells us it's
not going to take 500ms to do basic queries a year from now?

App Engine team: what's your explanation for this? Are you going to
put clauses in the SLA to guarantee max average get / put / query
latencies? Seems to me that if you want to go "enterprise-focused",
that's needed.

Vivek Puri

Sep 2, 2011, 11:47:16 AM
to Google App Engine
I will disagree with this post. We were on MS earlier and now on HRD.
While on MS, we used to have datastore timeouts all the time and it
was driving me insane looking at 20-30k error emails come in. After
the move to HRD, everything is stable and there are no datastore
timeouts anymore. I am happy with the move.

Robert Kluin

Sep 2, 2011, 12:10:30 PM
to google-a...@googlegroups.com, Ikai Lan (Google)
I think the point of this post is the trend. At one time master-slave
also performed quite well, then we'd see trends where latency
increased until a blowup. These graphs show a similar pattern to
that: increasing latencies over time. We would like to know that: 1)
the trend won't continue, 2) we won't start seeing massive blowups.

As for the SLA, see the exclusions section. SLAs are fairly worthless
in my honest opinion anyway, so this probably isn't really something
to worry about though.
http://code.google.com/appengine/sla.html


Robert


JH

Sep 2, 2011, 1:02:14 PM
to Google App Engine
I agree on both fronts.

Yes, M/S was a JOKE. Since moving to HRD I had not seen a datastore
timeout at all, until earlier this week, when I saw a small series of
them. I really hope we are not going back to what we were all used to
on M/S. The old M/S problem days were always written off as "well,
it's good enough for a free service." Now this is hardly a free
service. In fact it is a top-dollar service, so I'd hope that
datastore issues DO NOT RETURN.

Pol

Sep 2, 2011, 4:14:01 PM
to Google App Engine
Our app has been on HRD since the beginning and indeed, in the past few
days, for the first time ever, I saw some timeouts. To be clear, not
timeouts due to contention, which we get as expected now and then on
transactions, but timeouts on the db itself.

Our usage patterns haven't changed though, so I would expect it to be
on GAE's end.

Pol

Sep 5, 2011, 7:02:56 PM
to Google App Engine
Reviving this thread... GAE team: can we get an official comment on
this really important issue? Thx

Raymond C.

Sep 5, 2011, 9:36:41 PM
to google-a...@googlegroups.com
Especially since the datastore has been getting quite a number of timeouts these past two days:

John Patterson

Sep 5, 2011, 11:37:18 PM
to google-a...@googlegroups.com, Ikai Lan (Google)


On Friday, 2 September 2011 23:10:30 UTC+7, Robert Kluin wrote:
> As for the SLA, see the exclusions section.  SLAs are fairly worthless
> in my honest opinion anyway, so this probably isn't really something
> to worry about though.
>   http://code.google.com/appengine/sla.html


The SLA does not really offer me any peace of mind.  If the datastore slows down to the point where requests are taking 5 seconds to return then my app is as good as dead.  But the SLA doesn't cover performance problems - only errors.  If my app becomes so slow that it starts to throw DEE's then that is probably considered a problem with "the application code" rather than an infrastructure error.

Simon Knott

Sep 6, 2011, 8:11:48 AM
to google-a...@googlegroups.com

I've attached graphs showing the daily stats for each of the HR operations for the last year, driven from the GAE Status stats.

I'll grab the M/S datastore statistics, when I get the time, to give a comparison.

Mike Wesner

Sep 6, 2011, 9:57:52 AM
to Google App Engine
Thanks for putting those together. It very clearly shows that
recently things have changed.

It would be nice to hear from Greg or someone about this.

-Mike

On Sep 6, 7:11 am, Simon Knott <knott.si...@gmail.com> wrote:
> I've attached graphs showing the daily stats for each of the HR operations
> for the last year, driven from the GAE Status stats.
> I'll grab the M/S datastore statistics, when I get the time, to give a
> comparison.
>
> <https://lh4.googleusercontent.com/-MV_ixL3Jz0Y/TmYNsX9QEGI/AAAAAAAAAF...>
>
> <https://lh4.googleusercontent.com/-6yrWNd7wfxU/TmYNoldrFWI/AAAAAAAAAF...>
>
> <https://lh6.googleusercontent.com/-NiRHqyRtbro/TmYNgZLeRXI/AAAAAAAAAF...>
>
> <https://lh3.googleusercontent.com/-WGYYZUzXiAg/TmYNk35ZztI/AAAAAAAAAF...>
>
> <https://lh5.googleusercontent.com/-5GRqG-Sy17A/TmYN7E7C60I/AAAAAAAAAF...>

Jason Collins

Sep 6, 2011, 12:17:08 PM
to Google App Engine
Those trends are concerning. We're putting a large effort into
converting our app over to HR. Performance consistency is the only reason
we're doing this.
j

Gregory D'alesandre

Sep 6, 2011, 2:43:53 PM
to google-a...@googlegroups.com
Hello All,

This is an issue right now related to higher latency on HRD, particularly affecting queries.  We are working on potential fixes to mitigate the issue.  Once we have determined the full scope of it, we will provide further information.  This is not expected behavior for HRD and is indeed a recent issue; it has not been a slow, steady increase based on load or a fundamental issue with the way HRD is built.  The issue with M/S was that it relied heavily on a fundamentally inconsistent piece of infrastructure; this is not true with HRD.

Greg

Raymond C.

Sep 6, 2011, 9:54:56 PM
to google-a...@googlegroups.com
Glad to hear a Googler speak about it.  Thanks!

Pol

Sep 7, 2011, 11:19:55 AM
to Google App Engine
Great to see the problem acknowledged, so why is the HRD line still
all green at http://code.google.com/status/appengine?

The HRD is having problems now (yes, not as bad as the M/S problems, but
that's beside the point as the expectations have been officially
raised quite a bit for HRD), non-negligible problems, officially
acknowledged problems, so not putting it in the status page is
deceptive to say the least.

That line ought to be orange.

Gregory D'alesandre

Sep 8, 2011, 3:38:15 AM
to google-a...@googlegroups.com
Hi Pol, that is unfortunately a bug in the status dashboard that we are working on...

Greg