Do we have an ETA for the resolution of ongoing performance issues?


Tim Hoffman

Sep 14, 2010, 9:49:43 PM
to Google App Engine
Hi

http://groups.google.com.au/group/google-appengine-downtime-notify/browse_thread/thread/9cf3b0cafdd6c235
was posted several hours ago, with no updates.

I certainly am experiencing significant ongoing issues with taskqueues
and datastore timeouts and half the time can't get to the dashboard.

I know someone must be working hard on this, but a little more detail
on progress, i.e. an ETA to recovery, would be really great.

Thanks

Tim

Ikai Lan (Google)

Sep 15, 2010, 12:17:23 AM
to google-a...@googlegroups.com
Hi Tim, 

You can track the progress here:

http://groups.google.com/group/google-appengine-downtime-notify/brows...

It's pretty hard to give an ETA, but we'd like to resolve this as soon as possible. We're seeing signs that the issues may have subsided, but we'd like a bit more confidence before giving the all clear.




Raymond C.

Sep 15, 2010, 1:29:45 AM
to Google App Engine
As of 10:01 pm (PDT, log message time), my app is still generating a lot of
deadline exceeded errors. It happens roughly every 30 minutes (I can't
trace back very far since the log viewer breaks for me after paging),
and when it happens, all db put requests in that minute or two raise
the error.


Tim Hoffman

Sep 15, 2010, 2:03:52 AM
to Google App Engine
Hi Ikai

I have been watching that, but it hadn't been updated for a very long
time. (It has now, though.)
There just seemed to be deathly silence for over 4 hours ;-)

T


Tonny

Sep 15, 2010, 3:37:32 AM
to Google App Engine
Ditto, I have seen an unusually high number of timeouts in the last 24
hours. Usually I see 2-4 a day - I think I'm past 40 by now.

/Tonny

Kenneth

Sep 15, 2010, 4:16:02 AM
to Google App Engine
This is still ongoing this morning. Come on Google, App Engine has
been impaired now for 24 hours. This is really really shocking. Take
the whole thing down for an hour if you have to, just please please
stop with this limping along throwing random errors thing.

Thanks.

Alexander M

Sep 15, 2010, 7:05:51 AM
to Google App Engine
I agree ... I am seeing all sorts of errors that are due to timeouts,
and I also have problems accessing the dashboard. It has been unstable
for over 24 hours now (since the datastore maintenance?)

-Alex

Arny

Sep 15, 2010, 7:13:15 AM
to Google App Engine
We're still getting a lot of 500s (dashboard & front end).

Did you transfer our apps to lower-cost servers, or why is
everything working so badly since the maintenance?
When are the REAL paid services coming?


Cameron

Sep 15, 2010, 12:10:59 PM
to Google App Engine
Hi Ikai -

Just to be clear, the issues have NOT subsided since last night. I
hope you guys are working on eliminating the root of the problem and
not just "monitoring the service closely" as the latest App Engine
Notify post suggests.

http://groups.google.com/group/google-appengine-downtime-notify/browse_thread/thread/9cf3b0cafdd6c235

And even though the status page says "this spike did not affect the
performance or uptime of applications" - every spike DOES in fact
affect the performance of applications (mine at least - GQueues, but
probably all apps). These red spikes make my app inaccessible. Even
the yellow spikes cause many 500 errors. Basically this makes my app
unusable, because users can't get any consistent work done with the
frequent errors.

http://code.google.com/status/appengine/detail/datastore/2010/09/15#ae-trust-detail-datastore-get-latency

Most of all, the frequent errors make my app seem very brittle and
erode user confidence. Sales drop and my own forum gets lots
of complaints. And then of course people start posting on Twitter.

Anyway, I'm sure you guys are working very hard to fix the issues and
want App Engine to be as reliable as possible. My suggestion is that
you also look to improve communication during these times. I have to
respond to my own users during these situations. This becomes very
difficult when all I can tell them is "Google thinks the issue is
resolved and is just monitoring the situation" when clearly the status
graphs indicate otherwise and people can't access my app. Or "Google
says the issue didn't affect performance or uptime" when clearly it
has.

-Cameron

Ikai Lan (Google)

Sep 15, 2010, 5:32:09 PM
to google-a...@googlegroups.com
Hey guys,

I understand your frustration. We have many Google services deployed on top of App Engine as well, and we get pressure from both sides anytime events impact production. There are several issues around communication being highlighted here:

- App Engine Status page was being updated when we were having latency problems
- Status page did not accurately describe the impact
- Delay between when we recognized a production event and posting to downtime notify
- Downtime-notify emails are being marked as spam

We've attempted to take steps in the past to resolve the spam issue, though users are reporting that it hasn't worked. As for the delay between us accurately identifying a production event and updating the groups - well, we'll have to figure out how we can minimize that. At the very least, there will be an internal communications post-mortem about what we plan to do to address, or at least minimize the impact of, these issues.

The option to go into an unplanned maintenance period was on the table, but it was one of those situations where we assessed it as overkill, especially since there were periods when the latency appeared to have died down, only to restart again. We don't want to be too trigger-happy with unscheduled downtime, as degraded performance is usually preferable to a complete downtime state (many of you may disagree with me, but this is a bit of a judgment call depending on the level of degradation). We're cautiously optimistic at the moment about performance, but at this point, if the spikes begin appearing again, we may initiate another full downtime period. Stay tuned to the downtime-notify list. I'll let my team members know to post using their @google.com accounts to avoid being marked as spam.


Cameron

Sep 16, 2010, 1:24:11 AM
to Google App Engine
Ikai,

Thanks for listening to and considering our feedback. I saw several
of the spike descriptions on the status page were already updated
earlier today to be more accurate. I have full confidence that Google
engineers work as quickly as possible to resolve the issues during
these periods of disruption. Having good communication from you guys
acknowledging the severity of the problem and the efforts being taken
to fix it goes a long way towards being able to handle the situation
appropriately, both for myself and the users of my app.

Thanks for your hard work.

-Cameron


Jason C

Sep 16, 2010, 8:59:28 AM
to Google App Engine
I completely agree, Cameron.

I know that Google will be working hard to correct any problems.

However, unless I see communication to this effect, I am not sure
that Google is _aware_ there is a problem - more so if the App Engine
status page is not indicating a problem.

This forces me to sort of panic post on this group. A few pieces of
timely communication would prevent me from doing this.

j

Ikai Lan (Google)

Sep 16, 2010, 10:00:30 AM
to google-a...@googlegroups.com
Constructive posts are always helpful, especially if numbers are provided, so we welcome your posts when there are performance issues. Aggregate statistics are a 95% case in terms of getting us useful information - we may be missing localized phenomena.

There are reports of issues right now. We're investigating these and will update downtime-notify if we decide to take more drastic action to address the short term issues.
