Extremely dissatisfied with App Engine lately, Network Issues, Support doesn't help and loops


Kaan Soral

Aug 7, 2018, 8:32:26 AM
to Google App Engine
Basically, I regularly experience network issues like the one that happened here: https://groups.google.com/forum/#!topic/google-appengine-downtime-notify/wiXNbETOYgA

If a request fails to reach App Engine, it's invisible to the logs, so it's impossible to detect or debug these issues. I only learn about them from players/users.

They are usually regional too: one player may experience the issue for a long time, say 1-2 hours, while the others usually have no issues.

When I open a support ticket, I'm denied any assistance because I'm on the Bronze level and refuse to purchase a package, and I simply get directed to the Financial/SLA ticket URL. When I open a Financial/SLA Credit ticket, my claim is again denied because I need a Silver/Gold support package.

As for the issue itself, I believe it's impossible to debug. For example, Cloudflare marks each request with a unique ray ID, so if a network request fails, the issue can be debugged from its ray ID. App Engine has no such system, so all you have is reports of 500 (Internal Server Error) responses that don't show up in any logs.

EPS Dev-Team

Aug 7, 2018, 1:01:17 PM
to google-a...@googlegroups.com
+1

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-appengi...@googlegroups.com.
To post to this group, send email to google-a...@googlegroups.com.
Visit this group at https://groups.google.com/group/google-appengine.
To view this discussion on the web visit https://groups.google.com/d/msgid/google-appengine/ec00a8f8-1367-40da-8bfa-19c9f308365b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Attila-Mihaly Balazs

Aug 8, 2018, 12:38:40 AM
to Google App Engine
Just adding my two cents here: yes, debugging connectivity problems with App Engine can be very frustrating - but in my book that's just the price I pay for not needing an IT department 99.999% of the time :)

Also, regarding your Cloudflare comment: GAE responses carry the "x-cloud-trace-context" header, which can be used to find the request in the logs.
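
As a rough illustration of the header mentioned above, here is a minimal sketch that pulls the trace ID out of an "x-cloud-trace-context" value so it can be searched for in the logs. It assumes the commonly documented `TRACE_ID/SPAN_ID;o=OPTIONS` layout; the function name is made up for the example, and it only helps for requests that actually produced a response carrying the header.

```python
def parse_trace_context(header_value):
    """Split an x-cloud-trace-context value into (trace_id, span_id, options).

    Assumes the layout TRACE_ID/SPAN_ID;o=OPTIONS, where the span and
    options parts may be missing. Returns None if there is no trace ID.
    """
    if not header_value:
        return None
    # Separate the optional ";o=..." suffix first, then the optional span.
    rest, _, options = header_value.partition(";o=")
    trace_id, _, span_id = rest.partition("/")
    if not trace_id:
        return None
    return trace_id, span_id or None, options or None
```

A client could log the returned trace ID next to each failed request and later search the App Engine logs for it.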

All the best,
Attila

Kaan Soral

Aug 13, 2018, 9:44:02 AM
to Google App Engine
Yes, but the main problem is requests not reaching the logs, or appengine

By the way, the new support strategy is now loop+exhaust

Whenever I open a support ticket, I get looped to another place. In my last attempt, I was told that this issue is not covered by the SLAs: https://groups.google.com/forum/#!topic/google-appengine-downtime-notify/wiXNbETOYgA (full downtime, and not covered, bizarre) - and that I need to open a ticket on the Cloud Platform SLA page.

Anyway, I'm exhausted

This is coming from someone who has been using App Engine for almost a year. I've been using App Engine in simpler and simpler ways over time, because with anything complex you hit issues that were hard to debug in the past and, with the current support levels, are almost impossible to debug now. For my next project, I'll try not to use App Engine at all. I don't know whether I can dump my experience/knowledge of the platform, but I'll certainly try. It's very saddening :(

I really wish we could regain the dynamism of the older days, when we got actual help and found solutions to problems, rather than passing the day and ignoring issues.

Kaan Soral

Aug 13, 2018, 9:45:02 AM
to Google App Engine
Correction: for almost 10 years, not "almost a year"



John Lowry

Aug 13, 2018, 2:50:23 PM
to Google App Engine
Hi Kaan,

Sorry that you're experiencing these issues.

As Attila-Mihaly Balazs pointed out above, intermittent network issues are very difficult to debug.

I'm going to grab your support case (16573520) and will follow up privately to help you get to the bottom of this issue.

To help others who may be experiencing similar issues, it can be useful to understand the network path. In this case, it is:

Client -> ISP network -> Cloudflare -> Maglev (Google network load balancer) -> Google Frontend -> Cloud Load Balancer -> App Engine traffic router -> Appserver -> App

The server side logs are written by the Appserver, but requests can fail at any layer along the path above, and the errors won't necessarily get plumbed back to your app.

It helps to have good client-side monitoring to try to figure out the commonalities between the clients that are experiencing the problems. 

On the Google side, we have increasingly good logging and monitoring the further upstream the request reaches.

The troubleshooting methodology will be to try to get as much information from you on the problem experienced by clients and then try to figure out which layer of the stack is most likely to be the culprit. We can then dig into logs and monitoring at that layer in the stack.

For example, if the client gets an HTTP response from the GFE then it will be clearly labelled as such in the response headers and we know that the request reached Google's network.

If the client does not get an HTTP response then the problem most likely occurred before the GFE.
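
The two rules of thumb above can be sketched as a small client-side classifier. The markers checked here (`Server: Google Frontend` for the GFE, `cf-ray` for Cloudflare) are assumptions about what those layers typically send, not something stated in this thread, and the function name is hypothetical. A proxy such as Cloudflare may rewrite the `Server` header, so the GFE check is most useful against the raw appspot.com hostname.

```python
def guess_failure_layer(status, headers):
    """Guess how far a failed request travelled, from the client's view.

    status:  HTTP status code, or None if no HTTP response arrived.
    headers: dict of response headers (any case), or None.
    """
    if status is None:
        # No HTTP response at all: most likely failed before the GFE.
        return "before-gfe: no HTTP response (client, ISP, or proxy network)"
    h = {k.lower(): v for k, v in (headers or {}).items()}
    # Assumed marker: the GFE labels its responses in the Server header.
    if "google frontend" in h.get("server", "").lower():
        return "google: response came from the GFE or a layer behind it"
    # Assumed marker: Cloudflare adds a cf-ray header to what it serves.
    if "cf-ray" in h:
        return "cloudflare: response passed through or came from the proxy, ray " + h["cf-ray"]
    return "unknown: HTTP response without a recognised marker"
```

Logging this label together with the client's region and timestamp gives support a starting point for choosing which layer's logs to dig into.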

We recognize that this is a pain point for customers and are looking at ways to make this easier to debug.

Maximiliano Contartesi

Aug 13, 2018, 6:09:01 PM
to google-a...@googlegroups.com
We also had that problem with Cloudflare.

It is very difficult to identify these problems with the proxy in the way.

For this reason we decided to remove Cloudflare from all our services.




Kaan Soral

Aug 28, 2018, 3:50:37 AM
to Google App Engine
An update

1) John Lowry adopted my issue and debugged it with me; he was very transparent, helpful, and genuinely interested in helping

2) I improved my system to log 500 errors from both the Cloudflare-wrapped and the raw appspot.com endpoints, and verified that these 500s do happen and are invisible to the logs (it's not a Cloudflare issue; it's actually easier to observe with appspot.com)

Result: if you are using App Engine, you might be losing traffic and customers, and as far as you and your logs are concerned, these issues are invisible. Sadly, judging both from the responses to this issue and from my issue tracker threads (one example: https://issuetracker.google.com/issues/112668010 - I've reported other major/minor issues too, and the responses usually aim to stonewall them), none of these issues are being solved, and no one seems interested in improving anything.

This specific issue is in the network layer, and the problem is that it's not being propagated to the App Engine logs. That's a major breach of trust in my opinion. I'm not complaining about errors: errors do happen, and apps should adapt, but if you can't observe errors, you can't adapt. I'm only able to log these issues by catching them (App Engine is down when you catch them), keeping them for 40 seconds, and reporting them after that, by which time the network is presumably back up. So maybe my logging logic is failing, or maybe the network isn't back up after 40 seconds and I'm missing some of them too. I can't know. I likely need a 3rd party logging solution.
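
The catch, hold, and report-later approach described above might look roughly like the sketch below. It is an illustration under assumptions, not the code actually used: the class name, the `send` callback, and the injectable clock are hypothetical, and only the 40-second hold comes from the post. Reports that fail to send are kept for another attempt, which covers the case where the network is still down after 40 seconds.

```python
import time
from collections import deque


class DeferredErrorReporter:
    """Hold captured errors and report them once a delay has passed."""

    def __init__(self, send, hold_seconds=40, clock=time.time):
        self._send = send            # hypothetical callable(record) -> True on success
        self._hold = hold_seconds
        self._clock = clock
        self._pending = deque()      # (due_time, record), oldest first

    def capture(self, status, url):
        """Record a failed request; it becomes reportable after the hold."""
        now = self._clock()
        record = {"status": status, "url": url, "seen_at": now}
        self._pending.append((now + self._hold, record))

    def flush(self):
        """Try to send every due record; return how many were delivered."""
        now = self._clock()
        delivered = 0
        while self._pending and self._pending[0][0] <= now:
            due, record = self._pending.popleft()
            try:
                ok = self._send(record)
            except Exception:
                ok = False
            if not ok:
                # Still unreachable: put it back and wait for the next flush.
                self._pending.appendleft((due, record))
                break
            delivered += 1
        return delivered
```

Calling `flush()` on a timer means nothing is dropped while the outage lasts, although records are still lost if the client exits before a flush succeeds.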

The most saddening aspect is, I again invested a lot of my time and energy in debugging the issue, helping find the cause, trying my best, yet, has anything changed? Nope.

Katayoon (Cloud Platform Support)

Aug 28, 2018, 8:41:33 PM
to Google App Engine

Hi Kaan,


As mentioned in your feature request #112668010, I have forwarded your request to the Cloud App Engine product team for evaluation; however, there is no ETA or guarantee of implementation. All further updates should occur there.


Kaan Soral

Oct 12, 2018, 4:45:56 AM
to Google App Engine
As a follow up, I've logged around 90 502/503 errors around `date: Thu, 11 Oct 2018 23:15:27 GMT`, caused by multiple outages of roughly a minute each. I personally experienced one of them, then checked my logs and noticed the issue. It's been happening so often that I don't even bother to check any more.
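
To turn a pile of logged 502/503 timestamps into a count of distinct outages, something like the sketch below could be used. The function name and the 60-second gap threshold are assumptions chosen for illustration; timestamps are plain seconds (for example, Unix time).

```python
def group_into_outages(timestamps, max_gap=60):
    """Group error timestamps (in seconds) into outages.

    Errors no more than max_gap seconds after the previous one are
    treated as part of the same outage.
    Returns a list of (start, end, error_count) tuples.
    """
    outages = []
    for ts in sorted(timestamps):
        if outages and ts - outages[-1][1] <= max_gap:
            # Extend the current outage to include this error.
            start, _, count = outages[-1]
            outages[-1] = (start, ts, count + 1)
        else:
            # Gap too large: this error starts a new outage.
            outages.append((ts, ts, 1))
    return outages
```

Each tuple gives the first and last error seen in an outage plus the number of errors, which is easier to put in a ticket than a raw list of ~90 lines.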

As usual, if you look at the App Engine error logs, you can't see any of these errors or any sign of them.

At this point it's no different from fraud. I reported this issue more than 2 months ago, and it should've been solved by now. I'm not saying errors can't or shouldn't happen, but they should be transparent; hiding errors and outages like this is deception. (My suggestion: add a new log level, something like a "Network" level error, and log these there; otherwise, these errors will flood everyone's logs.)

I don't even bother making SLA requests any more; the minuscule compensation doesn't even make up for the time spent hunting these issues - and things don't get better, only worse.

Harmit Rishi (Cloud Platform Support)

Oct 15, 2018, 12:06:45 PM
to google-a...@googlegroups.com

Hello Kaan,


Thank you for following up on Google Groups. We appreciate the time and effort users take in providing the information to us for processing their requests.


After reviewing the issues you filed about not being able to track errors that occur before requests reach App Engine, I was able to verify that the Feature Request you had filed has been reopened. I would like to let you know that all feature requests, regardless of support level, are always taken seriously. The implementation time of a requested feature depends on its complexity, feasibility and popularity. Please see the following link for more information on what to expect when you’ve opened an issue.


You discussed the support process in your initial message, and I would like to reassure you that our flexible support extends to all users. For instance, our Support Options page provides an overview of what we offer (Basic, Role-Based, Enterprise). You can navigate to the “Free Support Resources” section of the page, which gives users access to Support Documentation, Reference Guides, Community Support, as well as Billing and Phone Billing.


Among the Free Support Resources discussed above, there is also an option called “Free Trial Technical Support”. Once you fill out the form and highlight which area you would like to receive support in, you can usually expect the requested support within 1 business day.


Our products and platforms rely on experienced users such as yourself. Although the Feature Request (#112668010) has already been reopened, we encourage users to create Feature Requests and Issue Bug Reports, as this will help us make our products better.