Hi!
S2 is an extension of Snappy. It can decode Snappy content, but S2 content cannot be decoded by Snappy. S2 is aimed at high throughput, so it features concurrent compression for bigger payloads (streams).
Benefits over Snappy:
Better compression
Concurrent stream compression - several GB/sec.
Faster decompression
Ability to quickly skip forward in compressed stream
Compatible with Snappy compressed content
Smaller max block size overhead on incompressible blocks.
An alternative, more efficient, but slightly slower compression mode available.
Block concatenation.
Automatic stream size padding.
Drawbacks over Go Snappy:
Not optimized for 32 bit systems.
No AMD64 assembler compression implementation yet, meaning slightly slower compression speed on single-core CPUs.
Uses slightly more memory due to larger blocks and concurrency (configurable).
This package is aimed at replacing Snappy as a high-speed compression package. If compression ratio is your main concern, zstandard compresses better, but typically at speeds slightly below the "better" mode in this package.
GitHub, with more details and benchmarks: https://github.com/klauspost/compress/tree/master/s2#s2-compression
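
For the curious, a stream round-trip looks roughly like this (a minimal
sketch only; WriterConcurrency and the other names are taken from the
README linked above, so treat it as illustrative rather than canonical):

package main

import (
    "bytes"
    "fmt"
    "io"

    "github.com/klauspost/compress/s2"
)

func main() {
    payload := bytes.Repeat([]byte("some fairly compressible payload. "), 10000)

    // Compress: the writer splits the input into blocks and compresses
    // them on several goroutines; WriterConcurrency caps that number.
    var compressed bytes.Buffer
    w := s2.NewWriter(&compressed, s2.WriterConcurrency(4))
    if _, err := w.Write(payload); err != nil {
        panic(err)
    }
    // Close flushes any buffered blocks and finishes the stream.
    if err := w.Close(); err != nil {
        panic(err)
    }

    // Decompress. The reader also accepts Snappy-framed input.
    r := s2.NewReader(&compressed)
    decoded, err := io.ReadAll(r)
    if err != nil {
        panic(err)
    }
    fmt.Println("round-trip ok:", bytes.Equal(decoded, payload),
        "compressed size:", compressed.Len())
}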
Excellent! Thank you, Klaus! It sounds like it does, but just to be clear: does S2 also replace/upgrade the snappy *streaming* format (the outer framing format, which is distinct from snappy itself; e.g. https://github.com/glycerine/go-unsnap-stream)?
Nice work, Klaus!
On Mon, Aug 26, 2019 at 8:29 PM Klaus Post <klau...@gmail.com> wrote:
> Concurrent stream compression - several GB/sec.
> Faster decompression
A number of modern compression formats and implementations allow for
concurrent *compression*.
Coincidentally, I've been working on a format that allows for
concurrent *decompression*.
On Wed, Aug 28, 2019 at 7:11 PM Klaus Post <klau...@gmail.com> wrote:
> TLDR; LZ4 is typically between the default and "better" mode of s2.
Nice!
Just a suggestion: rename "better" to either "betterSize / smaller"
(i.e. better compression ratio, worse throughput) or "betterSpeed /
faster", otherwise it's not immediately obvious along which axis
"better" is better. (Or, if it's better along both axes, why not
just always turn it on?)
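
To make the tradeoff concrete, here is a rough sketch of the two
block-level entry points (the Encode/EncodeBetter names are as in the
README; actual sizes will obviously vary with the input):

package main

import (
    "bytes"
    "fmt"

    "github.com/klauspost/compress/s2"
)

func main() {
    src := bytes.Repeat([]byte("the quick brown fox jumps over the lazy dog. "), 5000)

    fast := s2.Encode(nil, src)        // default mode: highest throughput
    small := s2.EncodeBetter(nil, src) // "better" mode: smaller output, slower

    fmt.Printf("default: %d bytes, better: %d bytes (uncompressed: %d)\n",
        len(fast), len(small), len(src))
}

On the stream writer the same choice appears to be exposed as the
WriterBetterCompression option.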
Also, from https://github.com/klauspost/compress/tree/master/s2#format-extensions:
> Format Extensions
> ...
> Framed compressed blocks can be up to 4MB (up from 64KB).
Do you know how much of the size or speed gains come from bumping 64K
to 4M? Put another way, do you still see good size/speed gains if you
keep all the other changes but leave the block size at 64K? One of
the original design goals was to limit the amount of memory needed
for decompressing an arbitrary snappy stream.
https://github.com/google/snappy/blob/master/framing_format.txt says:
"the uncompressed data in a chunk must be no longer than 65536 bytes.
This allows consumers to easily use small fixed-size buffers".
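
(A hedged sketch of how one might pin an S2 stream to snappy's 64 KiB
limit to measure exactly that; WriterBlockSize and ReaderMaxBlockSize
are option names taken from the README above, not verified here:)

package main

import (
    "bytes"
    "io"

    "github.com/klauspost/compress/s2"
)

func main() {
    payload := bytes.Repeat([]byte("some payload to frame. "), 100000)

    // Writer pinned to 64 KiB blocks, i.e. the limit from snappy's
    // framing_format.txt, instead of the larger S2 default.
    var buf bytes.Buffer
    w := s2.NewWriter(&buf, s2.WriterBlockSize(64<<10))
    if _, err := w.Write(payload); err != nil {
        panic(err)
    }
    if err := w.Close(); err != nil {
        panic(err)
    }

    // Reader that rejects blocks larger than 64 KiB, so decompression
    // memory stays bounded even for an untrusted stream.
    r := s2.NewReader(&buf, s2.ReaderMaxBlockSize(64<<10))
    if _, err := io.Copy(io.Discard, r); err != nil {
        panic(err)
    }
}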