502 Bad Gateway on GAE FLEX for Node JS Express App using WSS


Sai M

Jun 28, 2019, 11:05:18 AM
to Google App Engine
Hello

I have been working on a proof of concept to demonstrate an idea I have been researching. The idea is based on Google's Dialogflow, TogetherJS and WebRTC.

The application uses WebSockets for real-time, chat-style message communication.
It works perfectly on my local machine, but when I deploy it on GAE I repeatedly get a "502 Bad Gateway" error.

Initially I deployed the app on GAE SE and got the same error. After some research I found that WebSockets are not yet supported on GAE SE, so I deployed the app on GAE FLEX, but I am still getting the same issue.

I cannot even load the main landing page of the application; every request ends in "502 Bad Gateway".

I would appreciate any help or guidance on where I am going wrong. This is the first application I am trying to get working on GAE.

Thanks in advance

Best
ias

Harmit Rishi (Cloud Platform Support)

Jul 4, 2019, 10:55:33 AM
to Google App Engine
Hello, 

Thank you for using Google Groups!

From what I understand, you are getting a "502: BAD_GATEWAY" error message after deploying your application to the Google App Engine flexible environment. Your code base works in your local environment, and the issue only appears in the cloud. The good news is that I was able to find some troubleshooting documentation for this type of server error. Note that the links below point to the Cloud Endpoints documentation, but the troubleshooting steps apply to App Engine flexible, so they are worth checking out.

Based on my research, the flexible environment may take a few minutes before it responds to requests successfully. Typically, when you encounter a 5xx error (502, 503, etc.), it is recommended to wait a minute and try the request again. You may find more information about these errors specific to the App Engine flexible environment here.

However, most of the time the error code 502 with "BAD_GATEWAY" indicates that GAE terminated the application because it ran out of memory. By default, a GAE flex instance has only 1 GB of memory, of which only 600 MB is available to the application container. The documentation here describes how to troubleshoot this type of error (you will most likely have to investigate your Stackdriver logs).
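
To check whether memory is the culprit, one option is to log the process's memory usage periodically and compare it against the roughly 600 MB mentioned above. The following is only a minimal sketch, assuming a Node.js application; `memorySnapshotMb` and `startMemoryLogging` are hypothetical helper names (not part of App Engine or Express), and the 60-second interval is an arbitrary choice.

```javascript
// Convert the byte counts from process.memoryUsage() to megabytes,
// rounded to one decimal place.
function memorySnapshotMb(usage = process.memoryUsage()) {
  const toMb = (bytes) => Math.round((bytes / (1024 * 1024)) * 10) / 10;
  return {
    rssMb: toMb(usage.rss),             // total memory held by the process
    heapUsedMb: toMb(usage.heapUsed),   // JavaScript heap currently in use
    heapTotalMb: toMb(usage.heapTotal), // JavaScript heap currently allocated
  };
}

// Hypothetical helper: write a snapshot to stdout at a fixed interval so it
// appears in the application's logs.
function startMemoryLogging(intervalMs = 60000) {
  const timer = setInterval(() => {
    console.log('memory', JSON.stringify(memorySnapshotMb()));
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for logging
  return timer;
}
```

Calling `startMemoryLogging()` once at startup is enough; a steadily growing `rssMb` value shortly before each 502 would point toward the out-of-memory explanation.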

I hope this helps!


Sai M

Jul 5, 2019, 5:21:52 AM
to Google App Engine
Hello Harmit

Thanks for your response. I have tried various options, some of them based on the doc links you shared. Unfortunately I am still getting the same error and I am not sure where I am going wrong, so I would like to share more detail about what I have tried.

I understand from the documentation that the app must listen on port 8080. I can confirm that I have modified the app to listen on port 8080; please refer to the downloaded logs (attached to this message), which reflect this. I am still getting the same error, specifically a 502 status for the following:

a) request URL "/", with a response size of 552
b) request URL "/favicon.ico", with a response size of 552

I had already fixed the "/favicon.ico" error by placing a "favicon.ico" file in my app root folder, and it does not occur on my local machine. I am not sure why it comes up only on GAE Flex.

One more thing I noticed is a WARNING in the Stackdriver logs: "Warning: connect.session() MemoryStore is not designed for a production environment, as it will leak memory, and will not scale past a single process."

I thought this might be indirectly causing the issue, since I had not set up a store for "express-session". I fixed this by using the "session-file-store" npm package, and my express-session code now looks like this:

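// Assumed imports (not shown in the original post)
const expSession = require('express-session');
const FileStore = require('session-file-store')(expSession);
const { v4: uuid } = require('uuid');

app.use(expSession({
  // Persist sessions to files instead of the default in-memory MemoryStore
  store: new FileStore(),
  // Generate a unique ID for each new session
  genid: (req) => {
    return uuid();
  },
  secret: 'keyboard cat',
  resave: false,
  saveUninitialized: true
}))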
Again, I can confirm that this modification works fine on my local machine: it creates the session files in the default "/sessions" folder of the application.

Also, based on some of the online documentation, I have tried both of the following app.yaml configurations:

app.yaml configuration 1 -->
# [START gae_flex_quickstart_yaml]
runtime: nodejs
env: flex
manual_scaling:
  instances: 1  
resources:
  memory_gb: 4
# [END gae_flex_quickstart_yaml]

app.yaml configuration 2 -->
# [START gae_flex_quickstart_yaml]
runtime: nodejs
env: flex
manual_scaling:
  instances: 1  
resources:
  cpu: 1
  memory_gb: 0.5
  disk_size_gb: 10
# [END gae_flex_quickstart_yaml]

Still no luck; I get the same error. I think I have tried every possible option, including restarting my browser and clearing the cache.

So I am sharing the recent Stackdriver logs (attached to this message) so that you can see what I might be missing.

I am pretty much done with the actual coding of the application and I am eager to demonstrate the POC to a few promising business stakeholders, but I am stuck on this "502 Bad Gateway" error and I am not sure what is happening.

Thanks once again for your help and guidance. I hope the new information in this message, together with the Stackdriver logs, gives you more insight to help resolve this.

Thanks in advance

 
gae_app_module_id_default__logs__2019-07-05T08-58.json

Swathi Rai

Jul 5, 2019, 6:50:07 AM
to google-a...@googlegroups.com
Hello Harmit,

I am receiving the error message shown in the attachment. Please also find attached the detailed log report.
In addition, I can see two errors in my App Engine Error Reporting.
I have attached images of the detailed stack traces for these errors.
The project works perfectly fine on localhost as well as with mvn appengine:run.
mvn appengine:deploy also reports a successful build and deployment, but when I hit the app URL it returns Server Error.

Thanks and Regards,
Swathi Rai.


cloudSQLerror1.png
serverError.png
cloudSQLerror3.png
BeanCreationException.png
cloudSQLerror2.png

Sami Islam

Jul 13, 2019, 5:20:35 PM
to Google App Engine
Hello,

From the logs you provided, the error appears to originate from the nginx server. 502 errors are most often served by an nginx process in each VM instance that sits in front of your application container. It has many uses, including serving static resources according to the handler rules defined in your app.yaml and serving 502s when the application container is unresponsive or crashes. Good places to look in your logs:

1. Application logs: If your application encounters errors on specific lines of code, it may still log other statements successfully, which can help determine where in your handler code the error occurs.
2. stdout, stderr: If the runtime in the application container encounters errors, they may be found here. For instance, accessing locked-down libraries or attempting to write to a file system without write access may generate errors visible here.
3. nginx.* logs: If the application container is entirely unresponsive (which could happen if it is busy, crashed, out of memory, etc.), nginx should still receive the requests and generate appropriate logs. These log entries may indicate what the nginx process sent to the application container and what it received (or, more likely, failed to receive).
4. Cloud HTTP Load Balancer request logs: If the entire instance is unresponsive, including its nginx process, the load balancer may respond with failure codes to ensure client requests don't hang forever. This may happen if the application occupies all of the VM's resources, such as CPU or memory.

You can check the HTTP Load Balancer logs for App Engine Flex in Stackdriver and try to identify the status details that show up there. I have found some helpful threads (1, 2 and 3) that provide useful information regarding your issue.

Also, for completeness and clarity: the error you see could mean that the nginx proxy on a specific instance of your backend service failed to contact the web server (that is, your app) on that same instance. The problem is localized to that instance.

These errors often occur because nginx's health checks fail, but they can also happen when nginx does its normal job of proxying incoming requests to your application. Essentially, the connection between nginx and your application closes because your application does not respond, or because your application stops listening altogether (the application listens on port 8080, as seen in the error).
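
As a rough illustration of the listening side, the sketch below resolves the port from the `PORT` environment variable (falling back to 8080) and binds to all interfaces rather than only to localhost, since a server bound to localhost inside a container is a common reason a front-end proxy cannot reach it. It uses Node's built-in `http` module in place of Express to stay self-contained; `resolvePort`, `createApp` and `start` are hypothetical names, and that `PORT` is set for the container is an assumption here.

```javascript
const http = require('http');

// Use the PORT environment variable when it is a positive integer,
// otherwise fall back to 8080.
function resolvePort(env = process.env) {
  const parsed = Number.parseInt(env.PORT, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : 8080;
}

// Minimal request handler standing in for the real Express app.
function createApp() {
  return http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('ok');
  });
}

// Start listening on all interfaces so a proxy outside the container
// can connect. Call this once from the entry point.
function start() {
  const port = resolvePort();
  const server = createApp();
  server.listen(port, '0.0.0.0', () => {
    console.log(`listening on port ${port}`);
  });
  return server;
}
```

With Express the same idea applies to `app.listen(port, host)`. If WebSockets are attached to the server, they should share this one HTTP server rather than open a second port.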

Therefore, the issue comes down to your application code and the Node.js runtime. Node.js is a single-threaded runtime, which means your code must make proper use of the Node.js event loop and stay asynchronous. With asynchronous code, requests can be handled concurrently, which in turn allows your application to keep listening and responding to nginx. Once your application is asynchronous, it can run and scale properly in the cloud.
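
To make the event-loop point concrete, here is a small sketch of one common technique: splitting CPU-bound work into slices and yielding with `setImmediate` between them, so pending requests (including health checks) can be served in between. The function name `sumInChunks` and the chunk size are illustrative assumptions, not something taken from the application discussed in this thread.

```javascript
// Sum the integers 0..n-1 in slices of `chunkSize`, yielding to the event
// loop between slices so other callbacks can run.
function sumInChunks(n, chunkSize = 100000) {
  return new Promise((resolve) => {
    let i = 0;
    let total = 0;

    function runSlice() {
      const end = Math.min(i + chunkSize, n);
      for (; i < end; i++) {
        total += i;
      }
      if (i < n) {
        setImmediate(runSlice); // let queued I/O callbacks run first
      } else {
        resolve(total);
      }
    }

    runSlice();
  });
}
```

A single synchronous loop over the same range would block every other request until it finished; the sliced version trades a little throughput for responsiveness.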

Note that you should always perform exponential backoff and retry on the client side (that is, your proxy service) in case your backend service becomes too busy and times out. This way, even if you see 5xx responses, your client should eventually succeed once your backend has recovered.
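
A minimal sketch of client-side exponential backoff follows. The names `backoffDelayMs` and `retryWithBackoff`, the 100 ms base delay, the 5000 ms cap and the limit of 5 attempts are all illustrative assumptions; a production client would usually also add random jitter to the delay.

```javascript
// Delay before the retry that follows attempt number `attempt` (0-based):
// baseMs * 2^attempt, capped at capMs.
function backoffDelayMs(attempt, baseMs = 100, capMs = 5000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call `fn` until it resolves or `maxAttempts` calls have failed.
// `wait` can be replaced (for example in tests) to avoid real delays.
async function retryWithBackoff(
  fn,
  { maxAttempts = 5, baseMs = 100, capMs = 5000, wait = sleep } = {}
) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        await wait(backoffDelayMs(attempt, baseMs, capMs));
      }
    }
  }
  throw lastError; // every attempt failed
}
```

Here `fn` would wrap the actual request and throw (or reject) on a 5xx response, so that only transient server-side failures trigger a retry.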

If you have tested turning the nginx health checks off and you still see these 502 errors, then I strongly suggest you report this in the Public Issue Tracker. If the issue is resolved when the health checks are off, then you may be able to adjust the thresholds and intervals of the checks to give your instances more time to respond to nginx.

- If you do open an issue report, it is recommended to provide your project ID, the type of health checks you are using (legacy or updated), and a stacktrace of the error you are seeing. 

Regarding the second inquiry, you might find these threads (1, 2) useful, as they address the warning message you have been receiving about MemoryStore.

Please note that Google Groups is mainly intended for general discussion about Google Cloud products and services. If you need further assistance with this ongoing issue, you can create an issue on our Public Issue Tracker, as mentioned earlier, and additionally post the question on Stack Overflow to get further help from the community.