Sporadically terrible performance

Devel63

Mar 18, 2015, 11:21:32 PM
to google-a...@googlegroups.com
I haven't been paying close attention, but suddenly(?) today I'm seeing sporadically awful performance. It took 20 seconds just to serve a static HTML page, with no one else using this instance and no cold start. Normally this takes about 2 ms, so that is roughly 10,000 times slower. My actual datastore interactions were no better: some worked, some took orders of magnitude longer than usual, and some returned a 500 with no corresponding log entry. What is going on with GAE? Is anyone else seeing problems?

Or is this just normal?

Ed

Mar 19, 2015, 3:18:14 AM
to google-a...@googlegroups.com
This is unusual, but we are seeing the same behavior.

Our static handlers and trivial health-check handlers are seeing 30s+ latency. We've escalated with Google support, and we have an open ticket on the issue.

We'd been seeing strange behavior for a few days, including 30s+ on some modules that do nontrivial work. But static files and trivial handlers taking this long is new, and only started for us about a half hour ago.

Ed

Ed

Mar 19, 2015, 4:26:52 PM
to google-a...@googlegroups.com
FYI: this issue was resolved for us around 1:15 am Pacific (8:15 am UTC) this morning.

Ed

Kaan Soral

Mar 19, 2015, 5:26:07 PM
to google-a...@googlegroups.com
As usual, I've had a lot of confusion in the aftermath, caused by missing push queue tasks.

These kinds of performance flukes are also pretty regular, yet they usually strike without warning and screw things up.

Devel63

Mar 19, 2015, 5:53:49 PM
to google-a...@googlegroups.com
Ed, was this something that Google fixed for you specifically, or part of a larger issue?  

I haven't seen the problem again today in my extremely limited testing, but frankly I have no idea what the real situation is. I do see, though, that some interactions that were "always" very fast may need additional client logic to handle hugely slower responses, and I'm wondering how often that occurs: occasionally on an ongoing basis, in batches on rare occasions, or something else? I was warned not to deploy on GAE in part for this reason, but I had hoped those issues were gone.
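
For what it's worth, the kind of "additional client logic" meant here could look roughly like the sketch below: put a deadline on each call and retry a couple of times with backoff. This is only an illustration in plain Python, not anything GAE-specific; `call_with_deadline` and its parameters are hypothetical names, and it assumes the wrapped call is safe to repeat, since a timed-out attempt keeps running in the background.

```python
import concurrent.futures
import time


def call_with_deadline(fn, timeout_s, attempts=3, backoff_s=0.1):
    """Call fn(), abandoning any attempt that takes longer than timeout_s.

    Retries with exponential backoff and re-raises the timeout if every
    attempt is too slow. fn must be idempotent: an abandoned attempt is
    not cancelled and may still complete later.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=attempts)
    try:
        for attempt in range(attempts):
            future = executor.submit(fn)
            try:
                return future.result(timeout=timeout_s)
            except concurrent.futures.TimeoutError:
                if attempt == attempts - 1:
                    raise  # every attempt was too slow
                # Wait a little longer before each successive retry.
                time.sleep(backoff_s * (2 ** attempt))
    finally:
        # Do not block on attempts that are still running.
        executor.shutdown(wait=False)
```

The timeout and attempt count would have to be tuned against real latency numbers; with 20 to 30 second stalls like the ones in this thread, failing fast and showing the user an error may be better than retrying.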

Ed

Mar 19, 2015, 7:28:04 PM
to google-a...@googlegroups.com
As far as I know, it was a systemic issue, but I don't know how many people it affected.

I asked for a detailed explanation, but I'm told that the engineering investigation is still ongoing, so information is not yet available.

Ed

Kaan Soral

Mar 20, 2015, 5:11:44 PM
to google-a...@googlegroups.com
I inspected the issue further: 50% of my issues were caused by failed ndb.transaction calls. The other half I never tracked down, but I'm guessing they had similar causes.

I improved things on my end to handle such failures better
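
As a rough idea of what handling such failures can mean, here is a minimal sketch of retrying a failed transaction with jittered exponential backoff. It is deliberately generic Python rather than my actual code: `TransientTransactionError` is a hypothetical stand-in for whatever exception the datastore raises when a transaction fails, and `run_with_retries` is an illustrative helper, not part of ndb.

```python
import random
import time


class TransientTransactionError(Exception):
    """Hypothetical stand-in for the datastore's transaction-failure error."""


def run_with_retries(txn, attempts=4, base_delay_s=0.1, sleep=time.sleep):
    """Run txn(), retrying on TransientTransactionError with backoff."""
    for attempt in range(attempts):
        try:
            return txn()
        except TransientTransactionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Random delay up to an exponentially growing cap, so that
            # many clients do not retry in lockstep.
            sleep(random.uniform(0, base_delay_s * (2 ** attempt)))
```

The transaction body should be idempotent, because a transaction that reported failure may in some cases still have been applied. When the retries are exhausted, the failure should be logged or queued for later rather than silently dropped.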