data: /var/www/XXX/bundle/main.js:13190 - error: Forever detected script was killed by signal: SIGKILL
data: /var/www/XXX/bundle/main.js:13190 - error: Forever restarting script for 15 time
Jan 22 10:41:47 ip-10-31-196-68 kernel: [493895.113634] node invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Jan 22 10:41:47 ip-10-31-196-68 kernel: [493895.113738] [<ffffffff81106acd>] oom_kill_process.part.4.constprop.5+0x13d/0x280
Jan 22 10:44:12 ip-10-31-196-68 kernel: [494039.841765] nginx invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Of course we were getting very strange errors: sometimes a bad gateway when the OOM killer took out nginx, sometimes just a killed node process (which forever would restart). Running out of memory explained everything, and since upgrading the instance we haven't seen a single problem in the forever logs.
I think you're right about node going above 1 GB. Do you add any params to forever/node when you start it, or does it just grow to whatever capacity it needs? I suppose I can do more research there. In our case, we had eight node processes sharing 7 GB of RAM. Add to that mix the fact that we initially load the client up with most of its data, and there is a healthy amount of data being cached.
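From the little digging I've done so far (happy to be corrected): node won't grow unbounded on its own, since V8 caps the old-generation heap by default, somewhere around 1 GB on 64-bit builds, though the process RSS can sit above that. If you want to set the cap yourself, forever's -c option lets you pass flags through to node. The 1024 is just an example value in MB, and the script path is illustrative:

    # Hypothetical invocation: cap V8's old space at ~1 GB per process.
    forever start -c "node --max-old-space-size=1024" bundle/main.js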
Now we are running four processes sharing 15 GB of RAM. We haven't yet seen a spike in usage that would push any single process over 1 GB, but I'm sure we will, possibly next week.
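If anyone else wants to keep an eye on per-process memory, plain ps does the job (RSS is reported in KB):

    ps -C node -o pid,rss,args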
Thanks for the tips on the swap file and the 'fields' specifier -- we have done some of the field specifying, but have not looked into the swap file business at all. As far as CPU utilization goes, oplog tailing has made a dramatic difference: our web servers are down from a consistent 60-80% utilization to between 5% and 15%.
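For anyone else following the thread, the fields specifier is just the standard Mongo projection option on the cursor you return from a publish function; the collection and field names here are made up:

    // Publish only the fields the client actually renders,
    // instead of shipping whole documents over the wire.
    Meteor.publish('posts', function () {
      return Posts.find({}, { fields: { title: 1, createdAt: 1 } });
    });

And for the swap file, I gather the usual Linux recipe is something like this (run as root; the 2 GB size is arbitrary, and you'd add an /etc/fstab entry to survive reboots):

    dd if=/dev/zero of=/swapfile bs=1M count=2048
    chmod 600 /swapfile
    mkswap /swapfile
    swapon /swapfile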