Anyone who uses App Engine is technically being scammed - Why? No way to assess actual error rates


Kaan Soral

Dec 6, 2018, 2:03:59 PM
to Google App Engine
First of all, let me state that I love App Engine. I've been using it since the very early days, and what hurts me the most is the sorry state I feel the product is in, along with the feeling that those working on it no longer love and care for it the way the early engineers did.


TL;DR: Errors can happen on an internal network layer, yet they are not logged in our Console. They are logged in the network layer, where Google can see them and we can't. Even though Google could propagate these errors to our App Engine logs, it doesn't, although I reported the problem and requested this months ago. In my opinion, it's a critical issue and a major breach of trust.

Why the harsh language and the accusation of `being scammed`? Because there's no way to assess actual error rates or to see the errors. Maybe the error rate is 0.5%, way above SLA guarantees, but as developers using App Engine we have no way to know, because the errors are hidden rather than reported. To actually catch them, you need to implement your own logging for various usage scenarios.
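As a rough illustration of that kind of self-made logging, here is a minimal Python sketch that counts request outcomes as the client sees them, so an error rate can be computed independently of the server logs. The names `ErrorRateTracker` and `tracked_call` are hypothetical, and this is only one possible shape for such logging, not anything App Engine provides.

```python
import urllib.error
from collections import Counter


class ErrorRateTracker:
    """Counts request outcomes as observed by the client (hypothetical helper)."""

    def __init__(self):
        self.outcomes = Counter()

    def record(self, outcome):
        self.outcomes[outcome] += 1

    def error_rate(self):
        # Fraction of recorded calls that did not end in "ok".
        total = sum(self.outcomes.values())
        if total == 0:
            return 0.0
        return 1.0 - self.outcomes["ok"] / total


def tracked_call(tracker, request_fn):
    """Run request_fn(), record how it ended, and re-raise any failure."""
    try:
        result = request_fn()
    except urllib.error.HTTPError as exc:
        # The server (or something in front of it) answered with an error status.
        tracker.record("http_%d" % exc.code)
        raise
    except OSError:
        # Covers URLError, timeouts and connection resets: the cases where
        # the request may never have reached the application at all.
        tracker.record("network_error")
        raise
    tracker.record("ok")
    return result
```

In practice `request_fn` would wrap the real HTTP call, and the counters would be shipped somewhere outside App Engine so they can be compared against what the server logs claim.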

Technically, there are 2 problems:
1) A request is made but never hits App Engine, so it never gets logged. The client sees an error, but as the developer, you never see it.
2) A request is made and hits App Engine, but the result never reaches the requester. The requester sees a 500 while the actual operation returned a 200.

Depending on your architecture, scenario (1) is the better one. If you are doing critical operations, (2) could require manual intervention.
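One common way to soften scenario (2) is an idempotency key: the client attaches a unique ID to each operation and reuses it when retrying, and the backend returns the stored result instead of applying the operation twice. The Python sketch below is only an illustration under stated assumptions: `IdempotentBackend`, `add_gold` and `new_request_id` are hypothetical names, and a plain dict stands in for what would have to be a transactional datastore in a real backend.

```python
import uuid


class IdempotentBackend:
    """Toy backend; the dicts stand in for a transactional datastore."""

    def __init__(self):
        self.gold = {}        # player -> balance
        self.completed = {}   # request_id -> result of the finished operation

    def add_gold(self, request_id, player, amount):
        # A replay of a request that already succeeded returns the stored
        # result instead of applying the operation a second time.
        if request_id in self.completed:
            return self.completed[request_id]
        self.gold[player] = self.gold.get(player, 0) + amount
        result = {"player": player, "gold": self.gold[player]}
        self.completed[request_id] = result
        return result


def new_request_id():
    """The client generates this once per logical operation and reuses it on retry."""
    return uuid.uuid4().hex
```

With this shape, a client that sees a 500 for an operation that actually succeeded can retry with the same ID and get the original result back without duplicating anything. In a real system the check and the write would need to happen in a single transaction.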

I'm developing http://adventure.land - an MMORPG that uses App Engine for its backend. While it's a game, the operations need to be precise to prevent item duplication and to ensure data singularity; at this point, it's pretty much like bank infrastructure. If you check our Discord: https://discord.gg/44yUVeU and search for "App Engine", you can see how torturous this issue is. My entire architecture now assumes a backend that can fail at any time for long durations (which is kind of nice, but scenario (2) is out of the scope of this system; it requires an entirely different approach, which I'm not sure how to approach yet).

The holiday season is a very critical time. I'm way behind schedule and I want to release my game, yet once again an App Engine issue ruined my time and wasted the day.

Please fix this issue, report the errors, and try to prevent scenario (2) as much as possible

Attila-Mihaly Balazs

Dec 7, 2018, 6:36:54 AM
to Google App Engine
Not that I disagree (there are a fair number of "hidden errors" coming from G's infrastructure which G could do a better job of exposing to us), but in the end this is the nature of distributed systems. There will always be many possible sources of errors between clients and servers, and one has to do some work to make things more reliable (for example, retries with exponential backoff). There is just no way around that (well, there are probably some client-side libraries which offer this).
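For reference, a minimal Python sketch of retrying with exponential backoff and jitter. The function name `retry_with_backoff` and its parameters are hypothetical, and the delays are illustrative defaults rather than recommended values.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.5,
                       max_delay=30.0, retry_on=(OSError,), sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is exhausted."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The delay cap doubles on every failed attempt, up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # "Full jitter": sleep a random time up to the cap so that many
            # clients do not all retry at the same instant.
            sleep(random.uniform(0, cap))
```

Retrying blindly is only safe when the operation can be repeated without side effects, so a request that may already have succeeded on the server needs some form of deduplication on the backend.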

Attila

Kaan Soral

Dec 21, 2018, 2:01:58 AM
to Google App Engine
It just happened again: a mass outage spanning more than an hour, with everyone in my game affected. Nothing got logged in the App Engine logs; I only captured it because I retain client errors. Characters died and got jammed; basically, the issue just tore everything apart. I've coded everything to be self-healing at this point, but when the issue happens, it causes a mass reporting spree, and rightfully so.

So I will file an SLA claim at this point, but I have a hunch I'll have to report the issue 3-4 more times for just a measly SLA reimbursement

This issue needs to be fixed

dang...@google.com

Jan 3, 2019, 3:59:58 PM
to Google App Engine
Hello Kaan,

Please open a ticket on the Issue Tracker, since that is where you should get better assistance with bugs and technical issues[1].
