* LZF
* QuickLZ
* Gzip/deflate (JDK one, JCraft)
* Bzip2 (from commons compress)
it is important to have a variety of test data available. Luckily
there seems to be plenty to choose from, so I have included the
following sets under testdata/:
- Calgary and Canterbury corpus (sort of well-established benchmarks)
- maximumcompression test set
- test files from http://quicklz.com/bench.html
and will try to run them on my Mac mini, just to get a baseline. If
anyone else has the time and interest to run these, that would also help.
I am planning to add the resulting HTML pages to GitHub, and link to
them from the project wiki, once I get admin access to the project, or
someone enables "gh-pages" (which is the easy-ish way to add stuff).
Beyond this, it'd be great to get more codecs: either pure Java
ones or, if we must, JNI-wrapper based ones. I don't want to include
anything that would require shelling out to run, but JNI is sort of
acceptable.
So far the results are interesting: many codecs are fast, and the
range of speeds is huge. The only really slow one (relatively
speaking) is bzip2; I am tempted to check whether anything can be done
to improve it. I know the algorithm is not designed for speed, but it
seems like there might be room for improvement.
On the other hand, I am pretty happy with the speed of the fastest
codec, LZF; its compression speed is particularly impressive. But even
JDK/gzip is plenty fast when used the right way (it uses a native
codec, I think).
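
For anyone who wants a quick sanity check of the JDK/gzip numbers
outside the benchmark, here is a minimal sketch. It assumes that
"the right way" means handing the stream whole blocks with a decent
internal buffer rather than writing byte by byte; that is my guess
for illustration, not something measured here. The class name
GzipRoundTrip, the 8k buffer size and the best-of-N timing are all
made up for this example and are not part of the benchmark project.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not part of the benchmark project.
final class GzipRoundTrip {
    // Assumed buffer size; the value is a guess, not a tuned setting.
    private static final int BUFFER_SIZE = 8192;

    // Compress a whole block in one write call.
    static byte[] compress(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(input.length / 2 + 64);
        try (GZIPOutputStream gz = new GZIPOutputStream(out, BUFFER_SIZE)) {
            gz.write(input, 0, input.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed at this point, so the gzip trailer is written.
        return out.toByteArray();
    }

    // Uncompress by reading in buffer-sized chunks until end of stream.
    static byte[] uncompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(Math.max(64, compressed.length * 2));
        byte[] buffer = new byte[BUFFER_SIZE];
        try (GZIPInputStream gz = new GZIPInputStream(
                new ByteArrayInputStream(compressed), BUFFER_SIZE)) {
            int count;
            while ((count = gz.read(buffer)) != -1) {
                out.write(buffer, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Crude timing: best of N rounds, in nanoseconds. No JIT warm-up
    // handling, so treat the result as a rough indication only.
    static long bestCompressNanos(byte[] input, int rounds) {
        long best = Long.MAX_VALUE;
        for (int i = 0; i < rounds; i++) {
            long start = System.nanoTime();
            compress(input);
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    public static void main(String[] args) {
        // Synthetic, highly repetitive input; real test files behave differently.
        byte[] data = new byte[1 << 20];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) ('a' + (i % 16));
        }
        byte[] packed = compress(data);
        System.out.println("input bytes:  " + data.length);
        System.out.println("output bytes: " + packed.length);
        System.out.println("best nanos:   " + bestCompressNanos(data, 10));
    }
}
```

This is only good for a rough feel; numbers from synthetic input like
this say little about the Calgary/Canterbury files.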
-+ Tatu +-