How to handle MQTT broker disconnects?

Greg EVA

Jul 17, 2015, 9:35:28 AM
to node...@googlegroups.com
Hello NR Group,

I have a problem: I have been losing a data feed provided by an MQTT client.

I looked into it yesterday and identified three different causes.  Without getting into the details, I would just like to be able to catch the error/disconnection/lack of data flow in Node-RED and send myself an alert of some sort.

I just tried doing this with the Catch node, but it didn't catch anything.

I can see the node status changing in the flows... but I want to be able to catch this event in a flow in order to do something with it.  Searching Google and the group turned up no magical solutions from the community.

Any suggestions on how to deal with this?

Cheers,

Greg

Dave C-J

Jul 17, 2015, 11:19:21 AM
to node...@googlegroups.com
Hi Greg,

right now this isn't possible... but exposing node status is definitely on the to-do list. 

Hans Jespersen

Jul 18, 2015, 12:48:14 PM
to node...@googlegroups.com
Does the data feed consistently send data on a regular basis? In other words, can you make an alerting flow or function that is triggered only when a period of time elapses without any data? If data is received, the alerter just resets the timer back to zero, knowing the connection is working.

-hans

Greg EVA

Jul 20, 2015, 4:38:32 AM
to node...@googlegroups.com
@Dave - this is disappointing.  I would have at least imagined that the disconnect would cause an error which I would be able to catch with the Catch node.  This could perhaps be a configurable option for future consideration (even in the JS config file).

@Hans - thanks for the suggestion, this was what I was originally thinking of doing as the data should certainly change a few times in say a 5 minute period.  I'm going to branch off a new flow and use this approach.

Hans Jespersen

Jul 20, 2015, 10:42:34 AM
to node...@googlegroups.com
It seems unfortunate that not all of the MQTT features are exposed via the mqtt node; otherwise you could have used the Last Will and Testament (LWT) feature to generate an alert upon client disconnect.

Greg EVA

Jul 20, 2015, 11:18:05 AM
to node...@googlegroups.com
Although I cannot comment on what is and is not implemented in Node-RED, it would indeed be a shame to have a partial MQTT client/protocol implementation, as adhering to standards when building IoT applications is important, and Node-RED is heavily built on MQTT.

On the topic of LWT: it would still not resolve the issue, as the disconnection can happen on the Node-RED side. In that case (a network or server problem, for instance) there is no broker connection over which Node-RED could receive an LWT message.
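
For readers who have not used LWT: the will is registered by a client when it connects, and the broker publishes it to subscribers if that client drops off without a clean disconnect. The sketch below only builds the connection options; it assumes the MQTT.js client library (whose `connect` call accepts a `will` option), and the function name, topic layout, and broker host are made up for illustration.

```javascript
// Hypothetical helper: builds connection options that register a Last Will.
// The topic naming scheme ("status/<clientId>") is an assumption for this sketch.
function buildConnectOptions(clientId) {
    return {
        clientId: clientId,
        keepalive: 60, // seconds; the broker uses this to detect a dead client
        will: {
            topic: "status/" + clientId,
            payload: "offline",
            qos: 1,
            retain: true
        }
    };
}

// With MQTT.js installed, the options would be passed like this
// ("broker.example" is a placeholder host):
//   const client = require("mqtt").connect("mqtt://broker.example", buildConnectOptions("meter-feed"));
```

This also shows the limitation raised above: the will is delivered by the broker to other clients that are still connected, so a subscriber that has lost its own broker connection never sees it.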

Dave C-J

Jul 20, 2015, 11:32:03 AM
to node...@googlegroups.com
As previously mentioned.... adding status is on the (short term) to-do list.... as is a re-vamped MQTT client. 

Adrian Brown

Jul 20, 2015, 8:27:36 PM
to node-red
That's good news, Dave.
Bring on the revamped MQTT client and status support :-)

--
Regards
Adrian Brown
0477173894

Greg EVA

Jul 21, 2015, 5:47:22 AM
to node...@googlegroups.com
Yo dudes,

I just implemented this watchdog to send an SMS when the watchdog times out, and again when the connection is reestablished.  There is no better way to say it than "it works sweet".

The premise is simple...

  • the normal incoming MQTT subscription event flow resets the watchdog "last received" timer to now
  • a periodically scheduled flow then checks the difference between now and the last received value
  • if the difference is too large and no notification has been fired yet, the event is forked off to the "notification" flow
  • a text message is sent, and something is logged to the debug window
  • when data starts flowing again, it sends a message saying all is well and how long the feed was offline

The code...

[{"id":"c6c06478.a6cfa8","type":"twilio-api","sid":"abcdefghijklmnop","from":"+33600000000","name":""},{"id":"aa6a4d65.d4ffc8","type":"function","name":"Reset Watchdog Connection Timer","func":"var now = new Date();\nvar watchdog = Math.round( ( now.valueOf() / 1000 ) - context.global.MQTTMeterDataTime );\n\ncontext.global.MQTTMeterDataTime = now.valueOf() / 1000;\n// node.warn( \"resetting watchdog timer to: \" + now.valueOf() / 1000 );\n\n// if we're here, then comms are OK, so a new alert would be needed in case of failure\nif( context.global.DeadCommsAlertSent )\n{\n    msg.payload = \"HOUSE electric meter RESTORED.  It was offline for a total of \" + Math.round( watchdog / 60 ) + \" minutes.\";\n    context.global.DeadCommsAlertSent = 0;\n    return [ null, msg ];\n}\nreturn msg;","outputs":"2","valid":true,"x":406,"y":452,"z":"ac31dc1.db9ed2","wires":[[],["c006aa3.d2eca58","5100e026.80f04"]]},{"id":"ed174232.d79998","type":"inject","name":"Trigger schedule","topic":"","payload":"","payloadType":"date","repeat":"","crontab":"","once":true,"x":128,"y":479,"z":"ac31dc1.db9ed2","wires":[["369e07a1.826a3"]]},{"id":"369e07a1.826a3","type":"later","name":"Scheduled Watchdog Check","schedule":"every 1 minutes","x":392,"y":502,"z":"ac31dc1.db9ed2","wires":[["34784e2f.49b5ea"]]},{"id":"34784e2f.49b5ea","type":"function","name":"Evaluate Timer","func":"var now = new Date();\nvar watchdog = Math.round( ( now.valueOf() / 1000 ) - context.global.MQTTMeterDataTime );\n\nif( watchdog < ( 60 * 2 ) )\n{\n    node.warn(\"watchdog OK (\" + watchdog + \" sec.)\");\n    return [ msg, null ];\n} else {\n    node.warn(\"watchdog TOO BIG (\" + watchdog + \" sec.)\");\n    if( !context.global.DeadCommsAlertSent )\n    {\n        context.global.DeadCommsAlertSent = 1;\n        msg.payload = \"HOUSE electric meter CRITICAL alert: no data received for \" + Math.round( watchdog / 60 ) + \" minutes.  MQTT connection problem to Node-RED.\";\n        // the following will send a notification flow event\n        return [ null, msg ];\n    }\n}\n\n// alert already sent: send nothing\nreturn null;","outputs":"2","valid":true,"x":631,"y":501,"z":"ac31dc1.db9ed2","wires":[[],["5100e026.80f04","c006aa3.d2eca58"]]},{"id":"5100e026.80f04","type":"debug","name":"NOTIFICATION ACTION","active":true,"console":"false","complete":"true","x":906,"y":507,"z":"ac31dc1.db9ed2","wires":[]},{"id":"c006aa3.d2eca58","type":"twilio out","service":"_ext_","twilio":"c6c06478.a6cfa8","from":"","number":"+33613139538","name":"Send SMS","x":867,"y":457,"z":"ac31dc1.db9ed2","wires":[]}]
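
Unpacked from the JSON above, the logic of the two function nodes may be easier to follow as plain JavaScript. This is a sketch rather than the exact node code: the `state` object stands in for `context.global`, the function names and the `limitSec` parameter are invented for illustration, and the current time is passed in as an argument so the logic can be exercised without waiting.

```javascript
// Called for every incoming meter message ("Reset Watchdog Connection Timer").
// Returns a "restored" text if an alert was outstanding, otherwise null.
function onMeterData(state, nowSec) {
    const offlineSec = Math.round(nowSec - state.lastSeenSec);
    state.lastSeenSec = nowSec; // reset the watchdog
    if (state.alertSent) {
        state.alertSent = false; // re-arm so the next outage alerts again
        return "HOUSE electric meter RESTORED.  It was offline for a total of " +
            Math.round(offlineSec / 60) + " minutes.";
    }
    return null;
}

// Called once a minute by the scheduler ("Evaluate Timer").
// Returns an alert text the first time the limit is exceeded, otherwise null.
function checkWatchdog(state, nowSec, limitSec = 120) {
    const silentSec = Math.round(nowSec - state.lastSeenSec);
    if (silentSec < limitSec || state.alertSent) {
        return null; // either healthy, or the alert has already gone out
    }
    state.alertSent = true; // latch: only one alert per outage
    return "HOUSE electric meter CRITICAL alert: no data received for " +
        Math.round(silentSec / 60) + " minutes.  MQTT connection problem to Node-RED.";
}
```

One difference worth noting: the sketch expects `state.lastSeenSec` to be initialised up front (for example to the start-up time). In the flow, if the global has never been set, the subtraction yields NaN, the "less than two minutes" test fails, and the alert branch is taken.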

And a picture...



MQTT_Connection_Watchdog.png

Dave C-J

Jul 21, 2015, 3:06:19 PM
to node...@googlegroups.com
Greg,

nice work. There's always a way :-)
It may or may not be useful in this instance, but the trigger node can be configured to be a watchdog (hold off) trigger, or also as a single shot until reset.

Greg EVA

Jul 22, 2015, 3:25:06 AM
to node...@googlegroups.com

There sure is, Dave.  An awesome part of Node-RED is that you can come up with all sorts of solutions relatively easily.

I wasn't aware of the trigger node functionality, but having just read the docs, it probably would have done the job for me, at least until I added the part which sends a message when the thing comes back online (that needs a global context value, which could not be accessed outside of a function node).

Walter Kraembring

Jul 22, 2015, 12:05:40 PM
to Node-RED
I have a number of MQTT brokers running in various systems, monitored the simple way using trigger nodes (see picture below) and the Push Bullet node. Most of the time, I expect a lost connection will require manual intervention anyway, but it would be a nice feature if the trigger node could also handle the situation "timer triggered earlier, but a new message has arrived since then" and return a dedicated message for it (or something similar).

Best regards, Walter


Dave C-J

Jul 22, 2015, 1:58:59 PM
to node...@googlegroups.com

Hi Walter,
Not quite clear what you mean by another message. Is it different somehow? Could it use its own trigger node in parallel? Or could you use it to reset the trigger? Or.... Not really sure of the use case you are trying to describe.

Walter Kraembring

Jul 23, 2015, 1:18:18 AM
to Node-RED, ge...@ge-volution.eu
Sorry, that was not well explained, I agree.

What I meant is the following use case based on Trigger nodes (see the picture below as an example):

1) The Trigger node is configured typically as in the picture below

2) Messages normally arrive within the stipulated timeout period (60s in this case) and the delay is extended

3) Messages stop arriving and, after the delay, a message 'verisure service lost connection' is sent and pushed on to your smartphone, using Push Bullet in this case

4) When the connection is repaired and messages start to arrive again, it would be valuable if the Trigger node could detect this and report it with another message, 'verisure service connection restored'

If the node configuration had another check box like 'Send this message on recovery...' and an input field for the message, I think this would be useful

The exact wording needs some improvement, but I hope this is better explained

Kind regards,
Walter
  
Skärmklipp.PNG

Dave C-J

Jul 23, 2015, 4:03:20 AM
to node...@googlegroups.com

Why not just make it also send a first message (rather than nothing), eg.  "Connected"
That would happen once.... Then not again because of the extends... Then when lost would send the second message.  If it starts again it would then send the first (connected) message...
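
The behaviour described here can be simulated in a few lines of plain JavaScript. This is only a model of the "send first message, extend the delay on each new message, then send second message" idea, not the trigger node's own implementation; the function name and the injectable `timers` argument are assumptions made so the sketch can be tested without real delays.

```javascript
// Hypothetical model of a trigger configured to send `first` immediately,
// extend the delay whenever a new message arrives, and send `second` once
// the delay finally expires.
function createTrigger(delayMs, first, second, send, timers = { setTimeout, clearTimeout }) {
    const { setTimeout: set, clearTimeout: clear } = timers;
    let handle = null; // null means idle: no delay is currently running

    return function onMessage() {
        if (handle === null) {
            send(first);   // idle, so this message starts a new cycle
        } else {
            clear(handle); // already running, so just extend the delay
        }
        handle = set(function () {
            handle = null; // back to idle; the next message sends `first` again
            send(second);
        }, delayMs);
    };
}

// Example wiring (runs with real timers, so left commented out):
//   const onMessage = createTrigger(60000, "Connected", "Lost connection", console.log);
//   onMessage(); // call this for every incoming MQTT message
```

Because the model returns to the idle state after the second message, the first message arriving after an outage produces "Connected" again, which is the recovery notification asked for earlier in the thread.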

Greg EVA

Jul 23, 2015, 4:05:46 AM
to node...@googlegroups.com
Sounds interesting.... this could be a much simpler approach to what I put together.  Thanks for the suggestion Dave.

Walter Kraembring

Jul 23, 2015, 6:52:53 AM
to Node-RED, ge...@ge-volution.eu
"If it starts again it would then send the first (connected) message..."

I need to test this... are you sure it will start all over again once the connection is lost and then restored? Without forcing a deploy?

Walter Kraembring

Jul 23, 2015, 7:04:26 AM
to Node-RED, ge...@ge-volution.eu
"I need to test this..."

Just tested, it seems to work fine!!! 

Dave C-J

Jul 23, 2015, 12:35:40 PM
to node...@googlegroups.com

Kerching ! +1 :-)

Julian Knight

Jul 31, 2015, 11:32:14 AM
to Node-RED, ge...@ge-volution.eu
Late to the party, but I had a similar problem with connections to MongoDB, which runs on my NAS and doesn't restart after an OS update, so I have the following:


Though I really should tweak it because receiving 400+ Pushbullet messages to my desktop does kill the PB client! Obviously this only works because the TCP connection throws an error when the destination doesn't exist. Ideally, it should pass through something more sensible which we could then test for.
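
One possible tweak for the flood of notifications is to throttle the alerts so that at most one goes out per interval. The following is a minimal sketch for use in or alongside a function node; the helper name and the injectable `now` clock are assumptions for illustration, not part of any existing node.

```javascript
// Hypothetical helper: returns a function that answers "may I send an alert now?"
// and says yes at most once per minIntervalMs.
function createAlertThrottle(minIntervalMs, now = Date.now) {
    let lastSentAt = -Infinity; // no alert sent yet

    return function shouldSend() {
        const t = now();
        if (t - lastSentAt < minIntervalMs) {
            return false; // too soon after the previous alert: suppress
        }
        lastSentAt = t;
        return true;
    };
}

// Example: allow one Pushbullet alert every 10 minutes.
//   const shouldSend = createAlertThrottle(10 * 60 * 1000);
//   if (shouldSend()) { /* pass the error message on to the notification node */ }
```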
