A 502 Bad Gateway usually means that the nginx proxy responsible for handling requests for an App Engine Flexible instance has not been able to reach your application and deems it unhealthy. It determines this through the health checks that nginx performs (you can test this by disabling health checks).
If your application code is completely busy handling requests, it cannot respond to any health checks. nginx then tells App Engine that the instance is not healthy, requests to it fail with a 502, and the instance is shut down. App Engine then automatically starts a brand new instance in its place.
By default, the App Engine Flexible Python runtime uses sync workers. A sync worker handles only a single request at a time, so a long-running request blocks everything else, including health checks. It is therefore recommended to configure your application to use async workers so that it can always respond to health checks when it is in fact healthy.
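As a rough sketch of what switching to async workers could look like, assuming your app is served by gunicorn and that the `gevent` package is installed: a `gunicorn.conf.py` along these lines selects an async worker class. The worker count, connection limit, and fallback port are illustrative values, not recommendations.

```python
# gunicorn.conf.py (a minimal sketch; all values here are assumptions)
import os

# Listen on the port provided in the PORT environment variable,
# falling back to 8080 if it is not set.
bind = ":" + os.environ.get("PORT", "8080")

# Use gevent-based async workers instead of the default "sync" class.
# Requires the gevent package to be installed alongside gunicorn.
worker_class = "gevent"

# Number of worker processes (illustrative; tune for your instance size).
workers = 2

# Maximum number of simultaneous clients per async worker.
worker_connections = 1000
```

You would then start the server with something like `gunicorn -c gunicorn.conf.py main:app`, where `main:app` is a placeholder for your own module and WSGI application object.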
As for handling this on the frontend, you are correct in your assumption to retry with exponential backoff. Whenever your frontend client receives a 5XX HTTP response code from your App Engine application, it should retry the exact same request, but only after waiting a short delay that grows with each failed attempt. This exponentially increasing delay gives App Engine some time to recover and start a new instance, and prevents your application from being overloaded with requests.
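The retry logic can be sketched roughly as follows. It is written in Python for illustration only; the same structure carries over to whatever language your frontend uses. The function name `retry_on_5xx`, its parameters, the default values, and the added random jitter are all assumptions of this sketch rather than anything App Engine prescribes.

```python
import random
import time


def retry_on_5xx(send, max_retries=5, base_delay=0.5, max_delay=30.0,
                 sleep=time.sleep):
    """Call send() until it returns a non-5XX status or retries run out.

    send is a zero-argument callable that issues the request and returns
    a (status_code, body) tuple. It is a hypothetical hook so that any
    HTTP client can be plugged in.
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        # Success, or a non-server error that a retry would not fix,
        # or no retries left: hand the response back to the caller.
        if status < 500 or attempt == max_retries:
            return status, body
        # Exponential delay: base, 2*base, 4*base, ... capped at max_delay.
        delay = min(max_delay, base_delay * (2 ** attempt))
        # Add up to 50% random jitter so many clients do not retry in
        # lockstep (an assumption of this sketch, not a requirement).
        sleep(delay + random.uniform(0, delay / 2))
```

With the defaults above, a client that keeps receiving 502s would wait roughly 0.5 s, 1 s, 2 s, 4 s and 8 s (plus jitter) between attempts before giving up and returning the last response.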