I've been studying valgrind and I've got it running against node.
http://www.cprogramming.com/debugging/valgrind.html
I think it can be more helpful, but it doesn't seem to be able to
track the names and line numbers of code run through node addons. I'm
running my tests through valgrind like this:
$> valgrind -v --tool=memcheck --leak-check=full --show-reachable=yes node_g xmltest.js
Here's a gist of xmltest.js - http://gist.github.com/361906
And more info about it at -
http://groups.google.com/group/libxmljs/browse_thread/thread/f99d106896545f31#msg_fd0b72264d1c22c7
I've also tried different combinations of flags like
"--trace-children=yes". Anyway, I've had limited success and I'm hoping
someone on this list can shed some light. Here are my issues/questions.
- It makes sense to run node_g instead of node proper, correct?
Because this allows an external program to instrument the runtime?
- The problem with using node_g is that my tests actually fail and
segfault when using it. They run to completion when run with node.
What's the deal with node_g? Are there known issues caused by
differences here?
- When valgrind does run through the test it does report invalid reads
and bad blocks. However, it can't seem to give me any actual useful
info. Instead I get lots of ?s:
==1037== Invalid read of size 4
==1037== at 0x2DD7887: ???
==1037== by 0x2E19430: ???
==1037== by 0x2E19264: ???
==1037== by 0x2D54252: ???
==1037== by 0x2E15133: ???
==1037== by 0x2D4B69E: ???
==1037== by 0x2E1AEDE: ???
==1037== by 0x2E1BC74: ???
==1037== by 0x2D55E34: ???
==1037== by 0x2DD0DED: ???
==1037== by 0x2DD13FE: ???
==1037== by 0x2DE0F13: ???
==1037== Address 0xbaddeac is not stack'd, malloc'd or (recently) free'd
==1037==
==1037==
==1037== Process terminating with default action of signal 11 (SIGSEGV)
==1037== Access not within mapped region at address 0xBADDEAC
==1037== at 0x2DD7887: ???
==1037== by 0x2E19430: ???
==1037== by 0x2E19264: ???
==1037== by 0x2D54252: ???
==1037== by 0x2E15133: ???
==1037== by 0x2D4B69E: ???
==1037== by 0x2E1AEDE: ???
==1037== by 0x2E1BC74: ???
==1037== by 0x2D55E34: ???
==1037== by 0x2DD0DED: ???
==1037== by 0x2DD13FE: ???
==1037== by 0x2DE0F13: ???
My thought was that this is because libxmljs.node is a dynamic library
loaded after valgrind and so is not instrumented. Is this reasonable?
Is there anything I can do about this?
Any help is appreciated. FYI, I've tried node_debug and running
node_g through gdb. I get no useful backtraces out of either of
these. If I'm on the wrong road or someone can recommend other useful
tools, I'm all ears.
Thanks
:Marco
Yes
> - The problem with using node_g is that my tests actually fail and
> segfault when using it. They run to completion when run with node.
> What's the deal with node_g? Are there known issues caused by
> differences here?
node compiles out all of the sanity assertions made in V8 and Node in
order to run faster. node_g keeps them. A segfault in node_g is an
indication of a problem. Explore those segfaults in gdb and with
valgrind. I'd be happy to take a look at the code if you point out
the situation and where to look.
I'm not sure. I'll get back to you. I think Orlando was having a
similar problem with node-sqlite.
here's a quick script you can use to test it out:
var sys = require('sys'),
    fs = require('fs'),
    libxml = require('./libxmljs');

fs.readFile("test.xml", function (err, data) {
  if (err) throw err;
  var count = 0;
  // Parse the same document over and over, rescheduling via setTimeout.
  (function() {
    var doc = libxml.parseXmlString(data);
    count++;
    if ((count % 100) == 0) {
      sys.puts("Count:" + count);
      setTimeout(arguments.callee, 100);
    }
    else {
      setTimeout(arguments.callee, 1);
    }
  })();
});
no matter what delays i put in there it never seems to release any
memory and eventually gets killed when it runs out of memory. could it
be that the libxmljs objects are not being marked as out of scope (ie
their ref count is not getting reduced to 0 or however it works)?
// We need to notify V8 when we're idle so that it can run the garbage
// collector. The interface to this is V8::IdleNotification(). It returns
// true if the heap hasn't been fully compacted, and needs to be run again.
// Returning false means that it doesn't have any more work to do.
//
// We try to wait for a period of GC_INTERVAL (2 seconds) of idleness, where
// idleness means that no libev watchers have been executed. Since
// everything in node uses libev watchers, this is a pretty good measure of
// idleness. This is done with gc_check, which records the timestamp
// last_active on every tick of the event loop, and with gc_timer which
// executes every few seconds to measure if
//   last_active + GC_INTERVAL < ev_now()
// If we do find a period of idleness, then we start the gc_idle timer which
// will very rapidly call IdleNotification until the heap is fully
// compacted.
So it seems that node will wait for 2 seconds of idleness before it
invokes the GC. I've changed the sleep interval on the example above
to 4000 milliseconds and everything works great.
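To make the 2-second rule concrete, here is a small stand-alone C++ sketch of it. This is a simplified model, not node's code: the name SimulateIdleGc, the millisecond timestamps, and the assumption that needs_gc is set whenever any activity happens are mine, and the timer is assumed to fire exactly every GC_INTERVAL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed to match the "GC_INTERVAL (2 seconds)" from the quoted comment.
const long kGcIntervalMs = 2000;

// Hypothetical model of the idle check. activity_ms holds sorted timestamps
// (in ms) at which the event loop did some work. Returns the timer ticks at
// which a GC pass would be started.
std::vector<long> SimulateIdleGc(const std::vector<long>& activity_ms,
                                 long end_ms) {
  assert(end_ms >= 0);
  std::vector<long> gc_started_at;
  long last_active = 0;
  bool needs_gc = false;  // assumption: set by any activity since the last GC
  std::size_t next = 0;

  // The timer fires every kGcIntervalMs, like gc_timer in the quoted comment.
  for (long now = kGcIntervalMs; now <= end_ms; now += kGcIntervalMs) {
    // Record all activity that happened up to this tick.
    while (next < activity_ms.size() && activity_ms[next] <= now) {
      last_active = activity_ms[next];
      needs_gc = true;
      ++next;
    }
    // Same test as in the comment: last_active + GC_INTERVAL < now.
    if (now - last_active > kGcIntervalMs && needs_gc) {
      needs_gc = false;
      gc_started_at.push_back(now);
    }
  }
  return gc_started_at;
}
```

Under this model, activity every 100 ms over 10 seconds never starts a GC pass, while activity at 500, 4500 and 8500 ms starts one at the 4000 ms and 8000 ms ticks. That matches what the script shows: short timeouts never free memory, a 4000 ms sleep does.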
I'd like to suggest a change to node.js so we can tweak this timeout
as it's way too long for an application that might be constantly busy
with very little idle time. Either that or expose full control over
when the GC runs. I presume it could be implemented as an addon if
there are concerns about having it in the default libraries...
http://github.com/billywhizz/node-gc
let me know if i'm doing anything stupid or need to put more
protection around the call to V8::IdleNotification.
Is there a reason node doesn't immediately start calling
IdleNotification when the event loop is empty, and then keep calling it
until either it returns false or something is added to the event loop?
I assume each call runs quickly, which is why it needs to be called
multiple times. I'm not sure why it's a good assumption that if node has
been idle for 2 seconds it will stay idle longer, and with busy scripts
that idle period will obviously never happen.
Connor
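The proposal above could be sketched roughly like this. It is only an illustration: DrainIdleGc and both callbacks are hypothetical names, not node or V8 functions, and the return convention of idle_step (true means more work remains) follows the node.cc comment quoted in this thread rather than anything checked against V8 itself.

```cpp
#include <cassert>
#include <functional>

// Hypothetical driver: as soon as the loop has nothing pending, run
// incremental GC steps until the GC reports it is done or new work arrives.
//
// idle_step          - runs one GC step; returns true if more work remains
//                      (convention taken from the quoted comment, an
//                      assumption here).
// has_pending_events - returns true once something was added to the loop.
//
// Returns the number of GC steps that were run.
int DrainIdleGc(const std::function<bool()>& idle_step,
                const std::function<bool()>& has_pending_events) {
  assert(idle_step && has_pending_events);  // both callbacks must be set
  int steps = 0;
  while (!has_pending_events()) {
    ++steps;
    if (!idle_step()) {
      break;  // nothing left to collect
    }
  }
  return steps;
}
```

The point of the design is that the loop gives up the moment an event shows up, so a busy script still gets short GC slices between events instead of waiting for a 2-second gap that never comes.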
but i don't know why it was decided to implement the 2 second rule.
this is the code from node.cc:
static void CheckIdleness(EV_P_ ev_timer *watcher, int revents) {
  assert(watcher == &gc_timer);
  assert(revents == EV_TIMER);
  //fprintf(stderr, "check idle\n");
  ev_tstamp idle_time = ev_now(EV_DEFAULT_UC) - last_active;
  if (idle_time > GC_INTERVAL) {
    if (needs_gc) {
      needs_gc = false;
      if (!V8::IdleNotification()) {
        ev_idle_start(EV_DEFAULT_UC_ &gc_idle);
      }
    }
    // reset the timer
    gc_timer.repeat = GC_INTERVAL;
    ev_timer_again(EV_DEFAULT_UC_ watcher);
  }
}
that looks to me like the GC will never get a chance to run unless
there have been 2 seconds without any events being handled. can that
be correct??
Here's my libxmljs repo and a gist with the simplest code that works
in node but throws a segfault in node_g. If you have time to take a
look that would be great.
http://gist.github.com/363158
http://github.com/polotek/libxmljs
I haven't tried it, but scanning the code I've spotted some errors
http://github.com/polotek/libxmljs/blob/61d1a5b6a601c0d4bfa790c13d87823a0d6438d9/src/xml_node.cc#L18
This should be "return scope.Close(node->get_doc())"
http://github.com/polotek/libxmljs/blob/61d1a5b6a601c0d4bfa790c13d87823a0d6438d9/src/xml_node.cc#L75
http://github.com/polotek/libxmljs/blob/61d1a5b6a601c0d4bfa790c13d87823a0d6438d9/src/xml_node.cc#L84
http://github.com/polotek/libxmljs/blob/61d1a5b6a601c0d4bfa790c13d87823a0d6438d9/src/xml_node.cc#L93
[...]
Similar problem
http://github.com/polotek/libxmljs/blob/61d1a5b6a601c0d4bfa790c13d87823a0d6438d9/src/xml_node.cc#L93
Bad C-style. #include "xml_node.h"
http://github.com/polotek/libxmljs/blob/61d1a5b6a601c0d4bfa790c13d87823a0d6438d9/src/xml_namespace.cc#L22-35
Not an error, but there is no need to heap allocate these.
I think your problem is likely not closing the scope, since this is
not done anywhere in the code.
--
You received this message because you are subscribed to the Google Groups "nodejs" group.
Yes.
> So we will need to find ALL of the problems here before we
> can get a simple working test case?
Probably - but it shouldn't be hard to go through and wrap all those
return statements in scope.Close();
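For anyone wondering why the missing scope.Close() matters, here is a toy model in plain C++. It is NOT V8 code: ToyHandleScope, ToyHandle, NewHandle and the global handle stack are made-up stand-ins, and it only assumes what the review above implies, namely that handles created inside a scope die with it unless they are passed out through Close().

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy handle stack (hypothetical, not V8): one slot per local handle.
static std::vector<std::string> g_handle_stack;

struct ToyHandle {
  std::size_t slot;
};

// A handle is valid only while its slot is still on the stack.
bool IsValid(ToyHandle h) { return h.slot < g_handle_stack.size(); }

ToyHandle NewHandle(const std::string& value) {
  g_handle_stack.push_back(value);
  ToyHandle h = { g_handle_stack.size() - 1 };
  return h;
}

class ToyHandleScope {
 public:
  ToyHandleScope() : base_(g_handle_stack.size()) {}

  // Leaving the scope drops every handle created inside it.
  ~ToyHandleScope() { g_handle_stack.resize(base_); }

  // Re-creates the value as a handle that belongs to the enclosing scope.
  ToyHandle Close(ToyHandle h) {
    assert(IsValid(h));
    std::string value = g_handle_stack[h.slot];
    g_handle_stack.resize(base_);
    ToyHandle escaped = NewHandle(value);
    base_ = g_handle_stack.size();  // the destructor must keep this slot
    return escaped;
  }

 private:
  std::size_t base_;
};

// The pattern flagged in the review: the returned handle belongs to a
// scope that is destroyed on return.
ToyHandle GetDocWithoutClose() {
  ToyHandleScope scope;
  ToyHandle doc = NewHandle("doc");
  return doc;
}

// The suggested fix: hand the value to the enclosing scope first.
ToyHandle GetDocWithClose() {
  ToyHandleScope scope;
  ToyHandle doc = NewHandle("doc");
  return scope.Close(doc);
}
```

Called from inside an outer ToyHandleScope, GetDocWithoutClose() returns a handle whose slot is already gone, while GetDocWithClose() returns one that stays valid until the outer scope ends. In the toy, the stale handle would later alias whatever is allocated next, which is at least the same flavour of bug as the invalid reads valgrind reported.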