SSL Errors and AH becoming unresponsive - anyone else encounter this?


Ramsay Brown

Dec 11, 2014, 4:08:28 PM
to action...@googlegroups.com
Actionhero occasionally becomes functionally unresponsive without giving any description of the error that is bringing it down.

The [.node.bin] process remains running and listening on the proper port, but a cURL call against it reports an unknown SSL error: actionhero fails the Server Hello step of the SSL handshake.

This seems to happen randomly, and I've observed the AH server remain online (i.e., the process is running) but unresponsive (not processing calls) for up to 12 more hours before I forced it offline. Our environment is 64-bit Ubuntu 12.04.3 running on AWS EC2.

Anyone else encountered this bug?

Rams@Dopamine

Evan Tahler

Dec 11, 2014, 9:16:53 PM
to action...@googlegroups.com
Some other folks have reported this, but we were never able to come up with anything repeatable. Do you have tasks running which are stuck? Can you check how much RAM the process uses when this happens? Are you deployed on a PaaS like Heroku that might suspend your process? Which OS are you on?

I don't think this has to do with SSL/HTTPS. That message really just means that something went wrong with the HTTPS handshake, which includes the first byte never being received. I'm going to guess this behavior would be the same with HTTP-only.
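
One way to test that guess from outside the server is to separate the TCP connect from the TLS handshake. The following is a minimal sketch, not anything actionhero provides: `probe` and its return strings are hypothetical names, and it assumes you run it from a machine that can reach the server. A result of `tls_failed` on a port that still accepts connections would match the symptom described above (process listening, handshake never answered).

```python
import socket
import ssl


def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Report which stage of a connection attempt fails.

    Returns "tcp_failed", "tls_failed", or "ok" (hypothetical labels).
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "tcp_failed"  # nothing accepted the connection in time

    # Certificate checks are disabled on purpose: the probe only asks whether
    # the server completes a handshake at all, not whether it is trusted.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        # The handshake runs inside wrap_socket and inherits the socket timeout.
        with context.wrap_socket(sock, server_hostname=host):
            return "ok"
    except OSError:  # covers ssl.SSLError and handshake timeouts
        return "tls_failed"
    finally:
        sock.close()
```

If the probe reports `tls_failed` while the port is open, the process is most likely stalled rather than misconfigured for SSL, which is consistent with the suggestion that plain HTTP would hang in the same way.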

Ramsay Brown

Dec 12, 2014, 3:38:07 PM
to Evan Tahler, action...@googlegroups.com
Hi Evan,

Thanks - it's really upsetting because we have no way of knowing when it happens unless we check it constantly...

*We're running AH on a 64-bit Ubuntu 12.04.3 instance on AWS EC2.

*We're now trying to turn off all possible tasks to make sure it's not hanging on a task.

*Since we don't know when it happens it's hard to get a snapshot of RAM usage.

We'll keep you posted!

r

--
You received this message because you are subscribed to a topic in the Google Groups "actionHero.js" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/actionhero-js/8uvlY2X02_w/unsubscribe.
To unsubscribe from this group and all its topics, send an email to actionhero-js+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Evan Tahler

Dec 13, 2014, 8:32:39 PM
to action...@googlegroups.com, evant...@gmail.com
In my production apps, I do a few things to help with this kind of debugging (Some basic DevOps Tips):

- Always monitor the process (RAM and CPU). Tools like New Relic, Datadog, M/Monit, or even Nagios can do this for you. They sample the application fairly frequently and send that data to a central server for you to monitor. You can then set up alerts if the process is using too much of a given resource, hopefully *before* the application becomes unresponsive.
- Set up health checks that poll a status API. Set up a monitoring tool like Pingdom or M/Monit to poll an endpoint that checks the status of all underlying application components. You should check for DB connectivity and Redis connectivity, and the whole thing should respond in under (N) ms. If this goes wrong, send an alert!
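
If a hosted monitoring tool isn't available yet, the health-check idea can be approximated with a small external script. This is only a sketch under assumptions: `check_status`, the `max_ms` budget, and the status URL are all hypothetical, and the endpoint itself (the part that checks DB and Redis connectivity) still has to exist in your application.

```python
import time
import urllib.error
import urllib.request
from typing import Tuple


def check_status(url: str, max_ms: float = 1000.0) -> Tuple[bool, str]:
    """Fetch a status URL and decide whether the service looks healthy."""
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=max_ms / 1000.0) as response:
            response.read()
            status = response.status
    except urllib.error.HTTPError as err:
        return False, f"HTTP {err.code}"  # server answered with an error code
    except OSError as err:
        # Covers refused connections, timeouts, and failed TLS handshakes.
        return False, f"request failed: {err}"

    # The urlopen timeout applies to each socket operation, not the whole
    # request, so the total elapsed time is checked separately.
    elapsed_ms = (time.monotonic() - started) * 1000.0
    if elapsed_ms > max_ms:
        return False, f"too slow: {elapsed_ms:.0f} ms"
    return True, f"HTTP {status} in {elapsed_ms:.0f} ms"
```

Run something like this on a schedule from a different machine and send an alert whenever the first element of the result is false. Note that an HTTPS endpoint with a self-signed certificate will be reported as a failure, because `urlopen` verifies certificates by default.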

Correlating these two things should at least tell you whether this is a memory, task, or framework issue. If you have a small number of nodes, you can also try writing heap dumps to disk so you can inspect them when something goes wrong.
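
For the memory side, a record of RAM usage over time can stand in for the snapshot that is hard to take by hand. The sketch below assumes a Linux host (such as the Ubuntu instance mentioned earlier), where `/proc/<pid>/status` reports the resident set size as `VmRSS`. The function names are hypothetical, and it is a stand-in for a proper monitoring agent rather than a replacement for one.

```python
from typing import Optional


def parse_rss_kb(status_text: str) -> Optional[int]:
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     51234 kB".
            return int(line.split()[1])
    return None


def sample_rss_kb(pid: int) -> Optional[int]:
    """Read the current resident set size of a process on Linux."""
    try:
        with open(f"/proc/{pid}/status") as handle:
            return parse_rss_kb(handle.read())
    except OSError:
        return None  # process is gone, or /proc is not available
```

Calling `sample_rss_kb` with the server's process ID once a minute and appending the value with a timestamp to a log file gives a history that can be lined up against the time a health check first failed.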

I always keep the master branch running at http://demo.actionherojs.com to see if anything goes wrong, but I haven't seen a problem like this myself.

Evan Tahler

Dec 13, 2014, 8:33:50 PM
to action...@googlegroups.com, evant...@gmail.com
Good luck with the hunt! 

Evan Tahler

Dec 29, 2014, 12:57:27 AM
to action...@googlegroups.com, evant...@gmail.com
Were you able to find the culprit?

Ramsay Brown

Dec 29, 2014, 12:01:27 PM
to Evan Tahler, action...@googlegroups.com

Not yet, but we're hoping to have some good telemetry and memory monitoring installed soon so we can nab it.

