I've been working on ways to speed up PDoc and/or decrease its memory
usage.
Parsing is a bit slow, but not all that bad. The prototype.js test
fixture parses in about 60 seconds.
(Tobie, is there any way we can show a progress indicator during the
parsing stage? Is the parser aware of its "position" in the file
and/or how much is left to parse?)
Generating, on the other hand, is _very_ slow. For the same test
fixture, it takes ~220 seconds to go from an abstract parse tree to
final HTML.
I had several half-baked ideas to speed this up. Swapping out ERB for
Erubis [1] barely made a dent in the elapsed time. Experimenting with
other Ruby interpreters (simply for comparison purposes) was
frustrating — JRuby needed far more memory to render the HTML output
at a slower rate, and MacRuby isn't yet mature enough to run all the
prerequisites.
I did some quick profiling and discovered that most of the time is
spent in Documentation::Doc, the root node in the syntax tree.
Documentation::Doc#each is especially costly:
    def each
      elements.first.elements.map { |e| e.elements.last }.each { |tag| yield tag }
    end
I changed it to look something like this:
    def each
      els = elements.first.elements
      i = 0
      while i < els.size
        yield els[i].elements.last
        i += 1
      end
    end
That change alone reduced generation time to ~160 seconds. Not
surprising, since the rest of the class is heavily dependent on
Enumerable.
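For anyone who wants to compare the two versions in isolation, here is a rough sketch using Ruby's standard Benchmark library. StubNode, each_with_map and each_with_while are hypothetical stand-ins, not PDoc code, and the fake tree only imitates the nested `elements` layout; timings will vary by interpreter and tree size.

```ruby
require "benchmark"

# Hypothetical stand-in for a parse node: it only exposes `elements`.
StubNode = Struct.new(:elements)

# Fake tree: root -> wrapper -> 1,000 children, each ending in a tag string.
leaves = Array.new(1_000) { |i| StubNode.new([:comment, "tag#{i}"]) }
root = StubNode.new([StubNode.new(leaves)])

# Original shape: builds an intermediate array with map, then iterates it.
def each_with_map(root)
  root.elements.first.elements.map { |e| e.elements.last }.each { |tag| yield tag }
end

# Rewritten shape: a plain index loop with no intermediate array.
def each_with_while(root)
  els = root.elements.first.elements
  i = 0
  while i < els.size
    yield els[i].elements.last
    i += 1
  end
end

Benchmark.bm(10) do |bm|
  bm.report("map+each") { 1_000.times { each_with_map(root) { |t| t } } }
  bm.report("while")    { 1_000.times { each_with_while(root) { |t| t } } }
end
```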
Since it appears that the tree itself isn't modified at all during the
generation phase, I experimented with memoization (caching the result
the first time the method is called, then serving up the cached
version), thinking it would speed things up quite a bit. It didn't
seem to help much at all.
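To illustrate the kind of memoization meant here, a minimal sketch follows. MemoizedDoc and its `walks` counter are hypothetical and exist only for the example; the real class and the data it walks are different. Caching like this is only safe because the tree is not modified during generation.

```ruby
# Hypothetical stand-in for the root node; not PDoc's actual class.
class MemoizedDoc
  include Enumerable

  # Counts how many times the underlying data was actually walked.
  attr_reader :walks

  def initialize(raw_children)
    @raw_children = raw_children
    @walks = 0
  end

  # Compute the tag list on first use, then serve the cached array.
  def tags
    @tags ||= begin
      @walks += 1
      @raw_children.map { |child| child.last }
    end
  end

  def each(&block)
    tags.each(&block)
  end
end

doc = MemoizedDoc.new([[:comment, "Ajax"], [:comment, "Element"]])
doc.to_a           # first call walks the raw children
doc.map(&:upcase)  # later calls reuse the cached array
```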
My next idea — currently in progress — is to have a second set of
"leaner" objects that aren't subclasses of
`Treetop::Runtime::SyntaxNode`. These classes would accept a Treetop
node instance in the constructor, but would extract all metadata
without storing a reference to that node, so that all those bulky
Treetop objects can be garbage-collected.
Adding this extra phase would allow us to build a proper hierarchical
tree, rather than a flat list of nodes that acts like a tree. And it
should improve both generation time _and_ memory usage (though _peak_
memory usage would remain the same). I'll let you know how this goes.
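A rough sketch of what such a lean object could look like is below. LeanNode, StubParserNode and the `name` field are hypothetical, chosen only for illustration; `text_value` mirrors the accessor on Treetop syntax nodes, but the stub here is a plain Struct so the example runs without Treetop.

```ruby
# Hypothetical stand-in for a Treetop node, so the sketch is self-contained.
StubParserNode = Struct.new(:name, :text_value)

# Hypothetical lean node: copies the metadata it needs out of the parser
# node and keeps no reference to it, so the parser node can be collected.
class LeanNode
  attr_reader :name, :text, :children

  def initialize(parser_node)
    @name = parser_node.name.dup
    @text = parser_node.text_value.dup
    @children = []
  end

  # Builds a real hierarchy instead of a flat list that acts like a tree.
  def add_child(node)
    @children << node
    self
  end
end

namespace = LeanNode.new(StubParserNode.new("Ajax", "/** section: Ajax **/"))
klass = LeanNode.new(StubParserNode.new("Ajax.Request", "/** class Ajax.Request **/"))
namespace.add_child(klass)
```

Copying strings with `dup` is a deliberate choice in this sketch: it avoids sharing the exact objects held by the parser node.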
Cheers,
Andrew
[1] http://www.kuwata-lab.com/erubis/