We have 10 node.js http servers running on v0.10.12. The servers basically act as a gateway: for each incoming http request they make an outbound http client request and pipe its response back to the original request.
The code caches around 500k-600k 10-digit integer keys in a dictionary for around 24 hours before releasing them (notice the second line of the memoryUsage output below, where heapUsed drops after the release). However, after that point heapTotal starts growing unboundedly, even though heapUsed does not grow, until the process crashes. This has happened 5 days in a row, but not on all servers, and which server it hits appears to be random.
Linux version 2.6.32-279.14.1.el6.x86_64 (mock...@c6b8.bsys.dev.centos.org)
Running the process with or without --max_old_space_size=1700 made no difference.
Here's the code snippet that caches the data. We generate random ids for MySQL rows ourselves so that the request stays non-blocking, and we cache the issued ids to avoid collisions. We also move to a new table every day, so at midnight the entire cache for the previous day's table is deleted (a sketch of how that could be scheduled follows the code below).
//add to cache code
var recentRequestIds = {}; // dictionary keyed by table_name; each value is itself a
                           // dictionary of issued ids. The entry for the current day's
                           // table is the largest: it grows to ~600k keys and is
                           // expected to reach 1 million per day soon.
var generateUniqueRandom = function(table_name, callback) {
    var recentRequestIdsForTable = recentRequestIds[table_name];
    if (!recentRequestIdsForTable) { // first id for this table
        recentRequestIdsForTable = recentRequestIds[table_name] = {};
    }
    var attempts = 0;
    var random;
    while (attempts < 5) {
        random = Math.floor(Math.random() * (9999999999 - 1000000000) + 1000000000); // 10-digit random id
        if (!(random in recentRequestIdsForTable)) {
            recentRequestIdsForTable[random] = true;
            return callback(null, table_name + ":" + random);
        }
        attempts++;
    }
    return callback("Could not generate requestID", null);
};
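For context, it's invoked once per request, roughly like this (the table name here is hypothetical):

generateUniqueRandom("requests_2013_07_11", function(err, requestId) {
    if (err) return logger.warn(err);
    // requestId looks like "requests_2013_07_11:4837261905"
});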
//delete from cache code
exports.clearRecentRequestIds = function(table_name) {
    if (table_name in recentRequestIds) {
        delete recentRequestIds[table_name]; // drop the entire cached dictionary for table_name
        logger.info('cleared all recentRequestIds for table ' + table_name);
        return;
    }
    logger.warn(table_name + ' not found in recentRequestIds');
};
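The midnight clear is driven by a timer; the exact scheduling code isn't shown here, but it is roughly along these lines (a sketch, not our production code; tableNameFor is a hypothetical helper that maps a date to its table name):

// schedule the daily cache clear at the next midnight (sketch)
var scheduleMidnightClear = function() {
    var now = new Date();
    var midnight = new Date(now.getTime());
    midnight.setHours(24, 0, 0, 0); // rolls over to 00:00 of the next day
    setTimeout(function() {
        exports.clearRecentRequestIds(tableNameFor(now)); // clear the finished day's table
        scheduleMidnightClear(); // re-arm for the following day
    }, midnight.getTime() - now.getTime());
};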
Here's the memory usage. heapTotal increased to the maximum in around 1.5 hours. During that time the server functioned normally and had no CPU spikes. It is also odd that ~500k keys, which even stored as 10-character strings should amount to roughly 5-10 MB, take up over 100 MB (because freeing the cache reduced heapUsed from 217 MB to 76 MB). Maybe that has something to do with it?
{"rss":"368.0078125 MB","heapTotal":"291.7858581542969 MB","heapUsed":"217.14698028564453 MB"}
{"rss":"357.16796875 MB","heapTotal":"287.84947204589844 MB","heapUsed":"76.96298217773438 MB"}
{"rss":"475.1796875 MB","heapTotal":"409.88169860839844 MB","heapUsed":"66.50176239013672 MB"}
{"rss":"577.9921875 MB","heapTotal":"514.1995697021484 MB","heapUsed":"83.56008911132812 MB"}
- and eventually to
{"rss":"1463.27734375 MB","heapTotal":"1381.2594375610352 MB","heapUsed":"95.49384307861328 MB"}
{"rss":"1500.05078125 MB","heapTotal":"1413.7357559204102 MB","heapUsed":"93.21699523925781 MB"}
Also, on 0.10.10 we hit the busy-loop issue in the net module a couple of times, which caused unbounded memory growth and CPU spikes; that is why we upgraded to 0.10.12. Maybe this issue is related, though this one looks more like a V8 issue.
Please let me know if anyone has any idea or hunch about why this could be happening. I'm not sure how to construct a reliable reproduction, but my suspicion is that the sudden release of such a large amount of memory leaves the memory allocator in a bad state; a repro sketch I plan to try is below. I could also roll back to 0.8.21 and see how that behaves.
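The repro sketch mentioned above, which mirrors our key counts and the all-at-once release (whether it actually triggers the heapTotal growth is unverified):

// fill a dictionary with ~600k random 10-digit keys, drop it all at once,
// then keep logging memory to watch whether heapTotal keeps growing
var cache = {};
for (var i = 0; i < 600000; i++) {
    cache[Math.floor(Math.random() * (9999999999 - 1000000000) + 1000000000)] = true;
}
console.log('filled: ' + JSON.stringify(process.memoryUsage()));
cache = null; // release the whole dictionary, like the midnight clear
setInterval(function() {
    console.log(JSON.stringify(process.memoryUsage()));
}, 5000);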
Thanks
- Sumit