Hourly Crash - Memory Leak - Excessive CPU

576 views
Skip to first unread message

Julian Knight

unread,
Aug 23, 2016, 2:11:41 PM8/23/16
to Node-RED
Since Saturday when I did some updates, Node-Red has been crashing about once an hour.

It appears as though it is running out of memory. I tried reducing things down by uninstalling some unused nodes and removing some unnecessary flows. I even tried turning off some flows that I thought might cause issues (shelling out to run cmd line apps). Although this initially appeared to help, the memory fairly rapidly builds up again and then more crashes.

I've also noticed a really big jump in CPU usage with top reporting the Node-Red process running at >80% continuously.

Has anyone else noticed any recent problems?

Also, is there any way to track down what might be leaking memory? Or using excessive CPU?

Here is an example from the log when it crashes. Notice that it seems as though the shell out to run some commands seems to be causing the eventual crash. However, it is clear that this is just because NR has already used up virtually all of the RAM and this just pushes it over the edge.

23 Aug 06:14:24 - [info] [exec:ARP Scan (sudo)] sudo arp-scan --localnet --macfile=/home/pi/node/nr-live2/mac-vendor-jk.txt
23 Aug 06:14:24 - [error] [exec:ARP Scan (sudo)] Error: spawn ENOMEM
23 Aug 06:14:25 - [info] [debug:ERROR:H/ware In]
{ topic: '',
payload: 1471929264649,
_msgid: 'e1413cf5.1ebec',
error:
{ message: 'Error: spawn ENOMEM',
source:
{ id: 'aa8e0847.ed13d8',
type: 'exec',
name: 'ARP Scan (sudo)',
count: 1 },
stack: 'Error: spawn ENOMEM\n    at exports._errnoException (util.js:907:11)\n    at ChildProcess.spawn (internal/child_process.js:298:11)\n    at exports.spawn (child_process.js:362:9)\n    at Object.exports.execFile (child_process.js:151:15)\n    at exports.exec (child_process.js:111:18)\n    at ExecNode.<anonymous> (/home/pi/node/nr-live2/node_modules/node-red/nodes/core/core/75-exec.js:95:25)\n    at emitOne (events.js:77:13)\n    at ExecNode.emit (events.js:169:7)\n    at ExecNode.Node.receive (/home/pi/node/nr-live2/node_modules/node-red/red/runtime/nodes/Node.js:211:14)\n    at InjectNode.Node.send (/home/pi/node/nr-live2/node_modules/node-red/red/runtime/nodes/Node.js:133:22)\n    at InjectNode.<anonymous> (/home/pi/node/nr-live2/node_modules/node-red/nodes/core/core/20-inject.js:64:22)\n    at emitOne (events.js:77:13)\n    at InjectNode.emit (events.js:169:7)\n    at null._repeat (/home/pi/node/nr-live2/node_modules/node-red/nodes/core/core/20-inject.js:37:22)\n    at wrapper [as _onTimeout] (timers.js:275:11)\n    at Timer.listOnTimeout (timers.js:92:15)' } }
23 Aug 06:14:26 - [info] [exec:Check if JK's Phone2 is near (l2ping)] sudo l2ping -c 1 04:F7:E4:CE:2A:8F
23 Aug 06:14:26 - [error] [exec:Check if JK's Phone2 is near (l2ping)] Error: spawn ENOMEM
23 Aug 06:14:26 - [info] [debug:ERROR:H/ware In]
{ topic: 'DEVICES/BT/JK-PHONE2',
payload: 1471929266899,
_msgid: 'a5420f3b.5abdf',
error:
{ message: 'Error: spawn ENOMEM',
source:
{ id: '351354b9.e7913c',
type: 'exec',
name: 'Check if JK\'s Phone2 is near (l2ping)',
count: 1 },
stack: 'Error: spawn ENOMEM\n    at exports._errnoException (util.js:907:11)\n    at ChildProcess.spawn (internal/child_process.js:298:11)\n    at exports.spawn (child_process.js:362:9)\n    at Object.exports.execFile (child_process.js:151:15)\n    at exports.exec (child_process.js:111:18)\n    at ExecNode.<anonymous> (/home/pi/node/nr-live2/node_modules/node-red/nodes/core/core/75-exec.js:95:25)\n    at emitOne (events.js:77:13)\n    at ExecNode.emit (events.js:169:7)\n    at ExecNode.Node.receive (/home/pi/node/nr-live2/node_modules/node-red/red/runtime/nodes/Node.js:211:14)\n    at InjectNode.Node.send (/home/pi/node/nr-live2/node_modules/node-red/red/runtime/nodes/Node.js:133:22)\n    at InjectNode.<anonymous> (/home/pi/node/nr-live2/node_modules/node-red/nodes/core/core/20-inject.js:64:22)\n    at emitOne (events.js:77:13)\n    at InjectNode.emit (events.js:169:7)\n    at null._repeat (/home/pi/node/nr-live2/node_modules/node-red/nodes/core/core/20-inject.js:37:22)\n    at wrapper [as _onTimeout] (timers.js:275:11)\n    at Timer.listOnTimeout (timers.js:92:15)' } }
23 Aug 06:14:26 - [info] [exec:Check if JK's Phone is near (l2ping)] sudo l2ping -c 1 B4:E1:C4:B5:77:E4
23 Aug 06:14:26 - [error] [exec:Check if JK's Phone is near (l2ping)] Error: spawn ENOMEM
23 Aug 06:14:26 - [info] [debug:ERROR:H/ware In]
{ topic: 'DEVICES/BT/JK-PHONE',
payload: 1471929266922,
_msgid: '4c549196.b3ab7',
error:
{ message: 'Error: spawn ENOMEM',
source:
{ id: 'efa26c8b.fcdd2',
type: 'exec',
name: 'Check if JK\'s Phone s near (l2ping)',
count: 1 },
stack: 'Error: spawn ENOMEM\n    at exports._errnoException (util.js:907:11)\n    at ChildProcess.spawn (internal/child_process.js:298:11)\n    at exports.spawn (child_process.js:362:9)\n    at Object.exports.execFile (child_process.js:151:15)\n    at exports.exec (child_process.js:111:18)\n    at ExecNode.<anonymous> (/home/pi/node/nr-live2/node_modules/node-red/nodes/core/core/75-exec.js:95:25)\n    at emitOne (events.js:77:13)\n    at ExecNode.emit (events.js:169:7)\n    at ExecNode.Node.receive (/home/pi/node/nr-live2/node_modules/node-red/red/runtime/nodes/Node.js:211:14)\n    at InjectNode.Node.send (/home/pi/node/nr-live2/node_modules/node-red/red/runtime/nodes/Node.js:133:22)\n    at InjectNode.<anonymous> (/home/pi/node/nr-live2/node_modules/node-red/nodes/core/core/20-inject.js:64:22)\n    at emitOne (events.js:77:13)\n    at InjectNode.emit (events.js:169:7)\n    at null._repeat (/home/pi/node/nr-live2/node_modules/node-red/nodes/core/core/20-inject.js:37:22)\n    at wrapper [as _onTimeout] (timers.js:275:11)\n    at Timer.listOnTimeout (timers.js:92:15)' } }
internal/child_process.js:298
throw errnoException(err, 'spawn');
^
Error: spawn ENOMEM
at exports._errnoException (util.js:907:11)
at ChildProcess.spawn (internal/child_process.js:298:11)
at exports.spawn (child_process.js:362:9)
at null._repeat (/home/pi/node/nr-live2/node_modules/node-red-node-ping/88-ping.js:30:41)
at wrapper [as _onTimeout] (timers.js:275:11)
at Timer.listOnTimeout (timers.js:92:15)
nr-live2.service: main process exited, code=exited, status=1/FAILURE
Unit nr-live2.service entered failed state.
nr-live2.service holdoff time over, scheduling restart.
Stopping Node-Red Live2 Instance for Home Automation...
Starting Node-Red Live2 Instance for Home Automation...
Started Node-Red Live2 Instance for Home Automation.
Express 4 server listening on port 8081, serving node-red
Welcome to Node-RED
===================
23 Aug 06:14:39 - [info] Node-RED version: v0.14.6
23 Aug 06:14:39 - [info] Node.js  version: v4.5.0
23 Aug 06:14:39 - [info] Linux 4.4.16-v7+ arm LE
23 Aug 06:14:39 - [info] Loading palette nodes
pi : TTY=unknown ; PWD=/home/pi/node/nr-live2 ; USER=root ; COMMAND=/usr/bin/python -u /home/pi/node/nr-live2/node_modules/node-red/nodes/core/hardware/nrgpio.py info
pam_unix(sudo:session): session opened for user root by (uid=0)
pam_unix(sudo:session): session closed for user root
23 Aug 06:14:59 - [info] UI started at /ui
23 Aug 06:15:08 - [info] User directory : .nodered
23 Aug 06:15:08 - [info] Flows file     : .nodered/flows_pi2.json
23 Aug 06:15:12 - [info] Starting flows
23 Aug 06:15:12 - [info] [smartplug-device:SP01] Status changed: undefined-->initializing
23 Aug 06:15:13 - [info] [inject:c67f9a9d.f73f58] repeat = 300000
23 Aug 06:15:13 - [info] [inject:ee17c8c9.3af858] repeat = 300000
23 Aug 06:15:13 - [info] [inject:f9b6b21.776805] repeat = 300000
23 Aug 06:15:13 - [info] [inject:5e86a8f6.46cdd8] repeat = 300000
23 Aug 06:15:13 - [info] [inject:F.Hall (03) Off 9am M-F] crontab = 00 9 * * 1,2,3,4
23 Aug 06:15:14 - [info] [inject:Landing (04) Off 9am Mo-Th] crontab = 00 9 * * 1,2,3,4
23 Aug 06:15:14 - [info] [inject:R Hall (02) Off 9am M-F] crontab = 00 9 * * 1,2,3,4
23 Aug 06:15:14 - [info] [inject:Tree (05) On 3pm Mo-Su] crontab = 00 15 * * *
23 Aug 06:15:14 - [info] [inject:Tree (05) On 1:12am Mo-Su] crontab = 12 01 * * *
23 Aug 06:15:14 - [info] [inject:F Hall (03) Off M-Su/1am] crontab = 00 01 * * *
23 Aug 06:15:14 - [info] [inject:R Hall (02) Off M-Su/1am] crontab = 00 01 * * *
23 Aug 06:15:14 - [info] [inject:e2c7e8f8.0e9518] crontab = 03 18 * * *
23 Aug 06:15:14 - [info] [inject:F Hall (03) Off M-Su/1am] crontab = 57 18 * * *
23 Aug 06:15:15 - [info] [inject:Once an Hour] repeat = 3600000
23 Aug 06:15:15 - [info] Started flows
23 Aug 06:15:16 - [info] [exec:ARP Scan (sudo)] sudo arp-scan --localnet --macfile=/home/pi/node/nr-live2/mac-vendor-jk.txt
23 Aug 06:15:16 - [info] [exec:Check if JK's Phone2 is near (l2ping)] sudo l2ping -c 1 04:F7:E4:CE:2A:8F
pi : TTY=unknown ; PWD=/home/pi/node/nr-live2 ; USER=root ; COMMAND=/usr/bin/arp-scan --localnet --macfile=/home/pi/node/nr-live2/mac-vendor-jk.txt
pam_unix(sudo:session): session opened for user root by (uid=0)
23 Aug 06:15:16 - [info] [exec:Check if JK's Phone is near (l2ping)] sudo l2ping -c 1 B4:E1:C4:B5:77:E4
pi : TTY=unknown ; PWD=/home/pi/node/nr-live2 ; USER=root ; COMMAND=/usr/bin/l2ping -c 1 04:F7:E4:CE:2A:8F
pi : TTY=unknown ; PWD=/home/pi/node/nr-live2 ; USER=root ; COMMAND=/usr/bin/l2ping -c 1 B4:E1:C4:B5:77:E4
pam_unix(sudo:session): session opened for user root by (uid=0)
pam_unix(sudo:session): session opened for user root by (uid=0)
23 Aug 06:15:16 - [info] [rfx-lights-in:Listen to RFX: Lights] connecting to /dev/rfxtrx433e
23 Aug 06:15:16 - [info] serial port /dev/ttyAMA0 opened at 9600 baud 8N1
23 Aug 06:15:16 - [error] serial port /dev/arduino error: Error: Cannot open /dev/arduino
23 Aug 06:15:17 - [info] [mqtt-broker:3b2726f7.c4d8da] Connected to broker: Pi2_NR-Live@mqtt://localhost:1883
23 Aug 06:15:17 - [error] [json:8664f9e8.799b08] Unexpected end of input
pam_unix(sudo:session): session closed for user root


Walter Kraembring

unread,
Aug 24, 2016, 2:37:05 AM8/24/16
to Node-RED
Hello Julian,
Actually I updated two RPi's yesterday to the latest Jessie including NR 0.14.6 following instructions here: https://www.raspberrypi.org/blog/another-update-raspbian/

One of them have rather tough tasks (rather complex Node-RED configuration) but still works very well. Node-RED uses approx 29% of available memory and CPU is most of the time around 2-5%, temporary around 30% and peaking at around 47%. The other RPi is calm but also doing very little, approx 21% of available memory, CPU is around 0.3-0.6%

Best regards, Walter

Giles Roadnight

unread,
Aug 24, 2016, 3:22:11 AM8/24/16
to node...@googlegroups.com
Is it exposed to the public internet? When my Pi was exposed to the public internet with Node red on port 80 the poor little Pi died very quickly with the flood of malicious requests that came in.

--
http://nodered.org
 
Join us on Slack to continue the conversation: http://nodered.org/slack
---
You received this message because you are subscribed to the Google Groups "Node-RED" group.
To unsubscribe from this group and stop receiving emails from it, send an email to node-red+u...@googlegroups.com.
To post to this group, send email to node...@googlegroups.com.
Visit this group at https://groups.google.com/group/node-red.
For more options, visit https://groups.google.com/d/optout.

Julian Knight

unread,
Aug 24, 2016, 3:51:17 AM8/24/16
to Node-RED
Thanks Walter. The Pi is fairly heavily used as it has InfluxDB, Grafana & Mosquitto also running but it has never been a problem before. It was a very noticeable jump from about 2pm Sat. which I suspect was when I ran the various OS and NR updates though annoyingly I can't remember what was updated and I made some flow changes just afterwards too since everything seemed to be working OK at the time.

Mark Setrem

unread,
Aug 24, 2016, 3:55:52 AM8/24/16
to Node-RED
Probably worth reverting back to your backup, then doing each upgrading individually to see which one causes the problem

Julian Knight

unread,
Aug 24, 2016, 4:16:44 AM8/24/16
to Node-RED, gi...@roadnight.name
Good point Giles but no, not a chance. 

There is a very roundabout route to one web page but it is on a very obscure port and is proxied over an https connection via my NAS. Network traffic is running at less than 1kbps anyway and most of that will be the 2 SSH and 1 Grafana connections I'm currently using to monitor.

It is the process "/usr/local/bin/node --max-old-space-size=128 server.js" that is causing the issue. It is demonstrating a very rapid increase in memory use, about 75% at the moment (I restarted the Pi about 30min ago). It continues to grow until it reaches a point where the commands that shell out to the OS to run some command line monitors every 5min require too much memory and crash the process.

That much I can tell. What I can't tell of course is what, inside Node-Red, is causing the issue. It must be something I updated on Saturday because I believe I've removed any flow changes that might have caused issues - I'd added a flow that displays the package.json file via the Dashboard UI. I've also disabled the log output to the Dashboard which is the other flow I thought might significantly up the memory use.

Dave C-J

unread,
Aug 24, 2016, 7:16:21 AM8/24/16
to node...@googlegroups.com
the =128 refers to the MB limit at which it will start trying to garbage collect unused objects to try to free up memory. We deliberately set it quite low so it would not try to use all the memory and should leave some for all those other things you guys like to run, like brokers and databases and the base OS of course.

If you have 512 MB or 1GB ram then you could try upping that to =256 or =512 perhaps to see if there is something causing you to keep garbage collecting, then swapping back and so on. But it's fairly surprising unless you are playing with large objects like files, or have a lot of through ongoing, which would be obvious anyway - and shelling out once every 5 mins should be no hassle. I don't think we've intentionally changed anything recently that should make any difference - but...

Julian Knight

unread,
Aug 24, 2016, 7:32:10 AM8/24/16
to Node-RED
I'll try adjusting the memory though it seems clear that it is actually not helping. Looking at the monitor, even now (NR reset on the dot of 12pm), I'm already seeing:

 CPU%  MEM%  VIRT   RES   PID USER        NI S     TIME+   R/s   W/s Command
 76.5  52.5  566M  485M  3248 pi          10 R  19:09.67     0     0 /usr/local/bin/node --max-old-space-size=128 server.js


It keeps going until total memory usage has been knocking along at 98% for a few minutes (with some swap use building up) then it crashes, generally when one of the shell commands runs.

You will also note the excessive CPU utilisation which I've rarely seen in the past.

This is on a Pi2 with 1GB of RAM. The only other thing using a significant amount of RAM is InfluxDB but that is only about 11%.

In the couple of minutes I've been typing this email (and doing some actual work!), the memory is already:

 CPU%  MEM%  VIRT   RES   PID USER        NI S     TIME+   R/s   W/s Command
 83.6  64.2  676M  594M  3248 pi          10 R  24:54.65     0     0 /usr/local/bin/node --max-old-space-size=128 server.js

Julian Knight

unread,
Aug 24, 2016, 7:39:25 AM8/24/16
to Node-RED
Oh and in regard to large data items. No, as said in another reply, I was passing through the last couple of hundred lines of the log to the UI Dashboard but I've disabled that in case it was causing an issue - though I hadn't changed anything on that for a week or so.

I don't believe there is that much else happening. MQTT-Spy is telling me that I am averaging about 3 MQTT messages per second with peaks up to about 9. These are mainly a few sensors transmitting about every 50sec with NR retransmitting details from the input message (hence the peaks). I've shared my flow with you previously and not that much has changed.


On Wednesday, 24 August 2016 12:16:21 UTC+1, Dave C-J wrote:

Dave C-J

unread,
Aug 24, 2016, 8:34:41 AM8/24/16
to node...@googlegroups.com
that's not good :-(....
if you are in the mood... can you try (backup flows first of course) - completely uninstalling the dashboard ? - not just removing nodes - and make sure no hidden config node left lying around...

Julian Knight

unread,
Aug 24, 2016, 12:16:51 PM8/24/16
to Node-RED
OK, I will try that.

Julian Knight

unread,
Aug 24, 2016, 1:03:28 PM8/24/16
to Node-RED
Just trying with a second round of unnecessary nodes removed! Seems that I'd lost track of how many nodes I'd installed to try and then not used or stopped using. Left Dashboard in on this round, will remove it next. It is already looking to be the same as last time round.

Here is what the memory utilisation actually looks like over several hours:

Julian Knight

unread,
Aug 24, 2016, 1:30:35 PM8/24/16
to Node-RED
Right, definitely still happening.

One other oddity. After I've restarted NR and reloaded the admin UI, the deploy button is already lit up which isn't right. Also, I have an empty Dashboard config node which I have now several times deleted and re-deployed (full) but it always comes back! I''ve noticed a number of issues with those nodes actually since when you move things around using the dashboard sidebar leaving empty tabs, the actual groups/cards DON'T always move as instructed! I've not had time to raise an issue as yet but I wonder if the move from UI to Dashboard has made a mess of things.

OK, that's officially annoying. I removed ALL of the flows containing Dashboard nodes. Then I deleted ALL of the Dashboard configuration nodes. Then I did a full deploy. Then I reloaded the browser window. Yup, you guessed it, the unused Dashboard configuration nodes magically reappeared! Also note that deleting the configuration nodes leaves the entries in the Dashboard sidebar and I'm guessing that this is what causes the issue since I suspect that the Dashboard sidebar is then used to reconstitute the empty configuration nodes.

So then I tried deleting them all from the Dashboard sidebar instead. That worked except for the one remaining configuration node that I now realise is the base. Closing the Dashboard sidebar and then deleting the configuration node before deploying and reloading seems to finally have got rid of that. Phew, now to uninstall the Dashboard.

Right, I've removed the Dashboard node from the package.json and run "npm prune" again. So now restarting and we will see if this does the trick. I am still running with "--max-old-space-size=128" to keep things simple.

Julian Knight

unread,
Aug 24, 2016, 1:38:40 PM8/24/16
to Node-RED
Good news and bad news.

Good news: It isn't Dashboard that is causing the issue, I've removed all traces and the memory utilisation is going up at the same rate as before.

Bad news: What the heck is causing this?!

Time for drastic measures. I am going to rename my flow file and run with a new one.

On Wednesday, 24 August 2016 18:30:35 UTC+1, Julian Knight wrote:
Right, definitely still happening.


Julian Knight

unread,
Aug 24, 2016, 5:48:13 PM8/24/16
to Node-RED
Having emptied the config folder of all flow and config files to start from scratch, I now get the following baseline (I kept the same nodes installed as last time):

Without Node-Red running, the Pi2 is using around (avg):
CPU: 18% 
RAM: 29%
 
With Node-Red running an new, empty flow file with the same nodes installed as previously: 
CPU: 18%
RAM: 39% initially, dropping to 36/37% after 3 minutes

Indicating that any background CPU is within the standard variations on the platform and the RAM utilisation is around 8%.

No apparent memory leak there.

Then I completely rebuilt a new project from scratch & re-applied the same flow as before. Guess what - no apparent memory leak either! So now I'm completely flummoxed.

Currently, the main difference between the old and the new projects is that the new runs directly from red.js whereas the old runs from a server.js file (e.g. embedded mode).

I will continue to monitor but it looks like the issue is mysteriously absent at the moment. We will have to see if it returns.

In any case, I have some interesting metrics on how node-red runs. My full home-automation flow seems to occupy only around 20Mb, peaking at around 30 or so before GC kicks in. CPU runs at between 3 and 12 % with my HA flows running but interestingly runs at 0% with no flow at all (implying no background CPU overhead for the admin UI alone)? InfluxDB is a bit of a CPU hog, peaking every few seconds at >100% but it is fairly conservative with memory.

Julian Knight

unread,
Aug 25, 2016, 4:02:33 AM8/25/16
to Node-RED
So it has been stable all night now, memory builds up over time and then is cleared down as expected. Really not sure what happened.


On Wednesday, 24 August 2016 22:48:13 UTC+1, Julian Knight wrote:
Having emptied the config folder of all flow and config files to start from scratch, I now get the following baseline (I kept the same nodes installed as last time):
...
Reply all
Reply to author
Forward
0 new messages