That would probably be a good idea. It seems like an algorithmic
complexity problem somewhere in the reflection system (which,
apparently, is where ALL the code for the Python version is). It
behaves just fine for small files, which makes me think that there's
something like an O(N^2) algorithm somewhere where the other
versions use an O(N) algorithm. I don't believe the slowness is
inherent to Python. I'm trying to compare the versions to find out
the core algorithms used, but it's somewhat difficult because C++
doesn't have the extensive built-in reflection that Java does, which,
in turn, isn't anywhere near as dynamic as Python, so it seems that
the core algorithms actually differ per implementation.
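
One rough way to test the O(N^2) hunch without reading all of the
reflection code is to time the same operation at doubling input sizes
and fit the slope on a log-log scale: a slope near 1 suggests linear
behaviour, near 2 suggests quadratic. The sketch below is only an
illustration. All of the names in it (measure, growth_exponent,
linear_work, quadratic_work) are made up for this example, and the two
"work" functions are stand-ins for a real parse or serialize call, not
protobuf code.

```python
import math
import time


def measure(fn, sizes, repeats=3):
    """Return the best-of-`repeats` wall-clock time of fn(n) for each n."""
    timings = []
    for n in sizes:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(n)
            best = min(best, time.perf_counter() - start)
        timings.append(best)
    return timings


def growth_exponent(sizes, costs):
    """Least-squares slope of log(cost) against log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(c) for c in costs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Hypothetical stand-ins for the real workload (NOT protobuf code).
def linear_work(n):
    """Touch each of n items once: O(n)."""
    total = 0
    for i in range(n):
        total += i
    return total


def quadratic_work(n):
    """Membership test on a growing list inside a loop: O(n^2)."""
    seen = []
    for i in range(n):
        if i not in seen:  # scanning a list costs O(len(seen))
            seen.append(i)
    return len(seen)


if __name__ == "__main__":
    sizes = [500, 1000, 2000, 4000]
    for fn in (linear_work, quadratic_work):
        slope = growth_exponent(sizes, measure(fn, sizes))
        print(f"{fn.__name__}: estimated exponent {slope:.2f}")
```

To point this at the real library, the stand-in functions would be
replaced by something that builds or parses a message with n entries.
Timings are noisy, so the fitted exponent is only a hint about where
to look, not a proof.
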
That said, yes, it would be quite useful to have a set of benchmarks,
and it would also be good to have some developer notes to outline the
chosen algorithms and their expected runtimes, per implementation.
Both of these would help greatly for developing new language
interfaces. I can draft up some protos to use as tests, but my
knowledge of Protocol Buffers is still somewhat limited, of course :-).
Robbie