Jeff is busy right now. He might be able to tackle these bugs in a
couple of weeks. Meanwhile, Marco, are you taking it over? There is a
leak_stop branch which you're probably aware of.
As I said in the node.js ml thread, I consistently see segfaults and
bus errors within a running node program. So I can confirm it's not
an issue with the REPL.
My use case is: I am opening an epub file, reading its TOC, getting
some file path data (via XPath), opening and parsing HTML from another
bunch of files, getting more data via XPath... this is when things get
problematic with medium-sized files.
I am able, however, to open a small epub file and process it
correctly.
So it definitely has something to do with the amount of XML / XPath
processing in a node program.
Please let me know what other information / data / code I can provide.
See also the last comments at
http://github.com/sprsquish/libxmljs/issues#issue/18
Francisco
I think you're right that the segfaults increase greatly with larger
files. I also find problems when using the lib extensively for long
periods. For instance, I was trying to run benchmarks by parsing
and using a very small XML file in a loop. At a certain point the
loop craps out and the program segfaults.
I'm definitely not the best person to be tackling this issue. But I'm
giving it a shot. Thanks for the info and I'll keep you posted on
anything noteworthy.
:Marco
On Apr 6, 7:23 am, francisco treacy <francisco.tre...@gmail.com>
wrote:
http://github.com/polotek/libxmljs
I've got a few updates there you might be interested in. I've got
pull requests in to Jeff. But obviously he's busy :)
:Marco
On Apr 6, 7:23 am, francisco treacy <francisco.tre...@gmail.com>
wrote:
Just checked out and tried your fork (commit 61d1a5).
It still behaves exactly as the older libxmljs versions from Jeff.
The HTML files are about 50 KB. This works:
for (...) {
  var doc2 = libxml.parseHtmlFile(htmlFile);
  var body = doc2.get('//body');
}
However this:
for (...) {
  var doc2 = libxml.parseHtmlFile(htmlFile);
  var body = doc2.get('//body');
  sys.puts(body);
}
...blows up (segmentation fault) when I print out (or even just
access) the body node, after many iterations. It happens consistently
after ~20 iterations of the forEach loop.
Hope this helps?
Francisco
2010/4/7 Marco Rogers <marco....@gmail.com>:
2010/4/9 francisco treacy <francisc...@gmail.com>:
Update: *very strange* ... looks like it works now.

At the beginning of the file I had an unnecessary import of express.
var express = kiwi.require('express');
Commenting that out makes the program run without any trouble.
Hmmm...this goes beyond my understanding :)
Francisco
2010/4/9 francisco treacy <francisc...@gmail.com>:
Getting rid of the express require has helped, however when processing
more than one book at a time (per node process), I *still* get this
kind of error.
:Marco
On Apr 9, 1:28 pm, francisco treacy <francisco.tre...@gmail.com>
wrote:
> > Update: *very strange* ... looks like it works now.
> >> ps: the body.toString() and body.text() methods cause the segfault.
> >> If I call body.name(), it works just fine.
The question is what's different between my setup and yours. Here is
the full gist of what I'm doing.
git://gist.github.com/361906.git
A few notes about this script:
- I've got some code at the top that makes some changes for
compatibility with node 0.1.90. If you're using 0.1.33 it shouldn't
affect anything, but if it does, you can remove this.
- I've got 2 ways of running the test.
- looptest runs the iterations in one loop without pausing or
yielding to the event loop. This is more like your original test.
The memory stays higher, but it still works.
- eventlooptest runs each iteration as a separate callback on the
next round of the event loop. This is more efficient and is really
how this type of stuff should be done. You might notice that the
memory stays lower because the garbage collector has time to run
between iterations. But it's not significantly slower.
- Both functions take a filename, number of iterations and an optional
parameter which is a wait time. At the end of the loop test, it will
wait for the specified number of milliseconds and then print memory
usage again. This is just to see if the garbage collector catches up
after the load test.
- Don't run both tests at once. Just comment out either the call to
looptest() or eventlooptest() at any given time.
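For anyone following along without opening the gist, the two styles can be sketched roughly like this. This is just an illustration, not the gist's actual code: a hypothetical doWork() stands in for the parse-and-xpath step, and console.log / process.memoryUsage() stand in for whatever reporting the script does:

```javascript
// Hypothetical stand-in for one iteration of parse + XPath work.
function doWork(i) {
  return 'iteration ' + i;
}

// looptest style: every iteration runs synchronously in one loop,
// never yielding to the event loop, so the garbage collector gets no
// chance to run until the whole loop finishes.
function looptest(iterations) {
  for (var i = 0; i < iterations; i++) {
    doWork(i);
  }
  console.log('rss after looptest:', process.memoryUsage().rss);
}

// eventlooptest style: each iteration is scheduled as a separate
// callback, so the event loop turns over between iterations and the
// garbage collector can reclaim documents from earlier iterations.
function eventlooptest(iterations) {
  var i = 0;
  function step() {
    if (i >= iterations) {
      console.log('rss after eventlooptest:', process.memoryUsage().rss);
      return;
    }
    doWork(i++);
    setTimeout(step, 0); // yield before the next iteration
  }
  step();
}

// As noted above, run one at a time:
looptest(100);
```

The gist itself targets node 0.1.x, where the scheduling details may differ, but the shape is the same: the second style just breaks the work into separate event-loop turns.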
:Marco