[ANN] Pack: Interfaces for LZ77-based data compression

Andy Balholm

Sep 20, 2021, 6:47:10 PM
to golang-nuts
Many data-compression schemes are conceptually composed of two steps:
LZ77 (finding repeated sequences) and entropy encoding. But I've never
found a compression library that treats those steps as separate
components. So I made one: github.com/andybalholm/pack. It defines
interfaces for the two stages, and includes a few example
implementations (Encoders for snappy, flate, and brotli; and
MatchFinders based on snappy and flate.BestSpeed).
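
To make the two-stage split concrete, here is a minimal Go sketch of what such a pair of interfaces could look like. Every name in it (Match, MatchFinder, Encoder, hashFinder, rawEncoder, Compress, decode) is hypothetical and chosen for illustration; it is not necessarily the API that pack exposes, so check the repository for the real definitions. The match finder is a deliberately naive hash-table search, and the "encoder" does no entropy coding at all; it only shows where that stage would plug in. It assumes Go 1.19 or later for binary.AppendUvarint.

```go
package main

import "encoding/binary"

// Match is one step of an LZ77 parse: Unmatched literal bytes, then
// Length bytes copied from Distance bytes back. (Hypothetical type.)
type Match struct {
	Unmatched int
	Length    int
	Distance  int
}

// MatchFinder is stage one: it finds repeated sequences in src.
type MatchFinder interface {
	FindMatches(src []byte) []Match
}

// Encoder is stage two: it serializes literals and matches onto dst.
type Encoder interface {
	Encode(dst, src []byte, matches []Match) []byte
}

const minMatch = 4

// hashFinder is a naive MatchFinder: it remembers the most recent
// position of every 4-byte sequence and extends any hit greedily.
type hashFinder struct{}

func (hashFinder) FindMatches(src []byte) []Match {
	var matches []Match
	last := make(map[[minMatch]byte]int)
	litStart := 0 // start of the pending run of literals
	i := 0
	for i+minMatch <= len(src) {
		var key [minMatch]byte
		copy(key[:], src[i:])
		prev, ok := last[key]
		last[key] = i
		if !ok {
			i++
			continue
		}
		n := minMatch
		for i+n < len(src) && src[prev+n] == src[i+n] {
			n++
		}
		matches = append(matches, Match{Unmatched: i - litStart, Length: n, Distance: i - prev})
		i += n
		litStart = i
	}
	if litStart < len(src) {
		// Trailing literals are stored as a Match with zero Length.
		matches = append(matches, Match{Unmatched: len(src) - litStart})
	}
	return matches
}

// rawEncoder is a toy Encoder with no entropy coding: for each Match it
// writes a varint literal count, the literals, then varint length and distance.
type rawEncoder struct{}

func (rawEncoder) Encode(dst, src []byte, matches []Match) []byte {
	pos := 0
	for _, m := range matches {
		dst = binary.AppendUvarint(dst, uint64(m.Unmatched))
		dst = append(dst, src[pos:pos+m.Unmatched]...)
		dst = binary.AppendUvarint(dst, uint64(m.Length))
		dst = binary.AppendUvarint(dst, uint64(m.Distance))
		pos += m.Unmatched + m.Length
	}
	return dst
}

// Compress wires any MatchFinder to any Encoder.
func Compress(f MatchFinder, e Encoder, src []byte) []byte {
	return e.Encode(nil, src, f.FindMatches(src))
}

// decode reverses rawEncoder. It assumes well-formed input and does no
// error checking.
func decode(data []byte) []byte {
	var out []byte
	for len(data) > 0 {
		lit, n := binary.Uvarint(data)
		data = data[n:]
		out = append(out, data[:lit]...)
		data = data[lit:]
		length, n := binary.Uvarint(data)
		data = data[n:]
		dist, n := binary.Uvarint(data)
		data = data[n:]
		// Copy byte by byte so overlapping matches work.
		for j := uint64(0); j < length; j++ {
			out = append(out, out[len(out)-int(dist)])
		}
	}
	return out
}
```

With these definitions, Compress(hashFinder{}, rawEncoder{}, data) produces bytes that decode turns back into data. Swapping in a different match finder or a real entropy coder only means passing a different value to Compress, which is the kind of experimentation the split is meant to make easy.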

Of course, going through these interfaces carries a performance
penalty, but my flate.BestSpeed implementation is still faster than the
standard library flate package.

The biggest disadvantage of dividing things up like this is that it
rules out format-specific features, such as brotli's static dictionary,
and feedback from the entropy encoder to the LZ77 stage, which brotli
and zopfli use at higher compression levels. But breaking the problem up
into smaller pieces makes it much easier to experiment with compression
algorithms.

Andy
