A while ago I posted here asking whether a zlib-like decoder implementation was available, and it turned out that one was still in progress. I've since seen a number of such projects, but they're all based on streams of compressed blocks (each of which is decoded all at once), which unfortunately doesn't fit my use case. So I ended up implementing this myself, and the code is available here:
https://github.com/pdjonov/Lz4Stream
It won't perform nearly as well as the standard implementation, but it's still pretty quick (quick enough to beat the popular managed zlib implementations), given that it satisfies the following requirements:
- It's fully type-safe and verifiable managed code (no calling out to a C library, no use of C#'s unsafe), making it useful in certain constrained use cases.
- It reads from a single continuous block of raw LZ4-compressed data. You can take the output of LZ4_compress[HC] and feed it directly to the decoder.
- It loads data incrementally from a Stream, using a very small internal buffer, preventing long stalls when reading incrementally from a slow source.
- It is a Stream (though it is read-only and non-seekable), so it easily works with existing .NET code.
- It performs well reading large and small blocks alike, so there's no need to rewrite or buffer for clients which do many small reads.
- It uses just a bit more than 64K of internal state. That's not insignificant, but it's a huge savings when streaming through 10+ MB blocks of solid LZ4 data.
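For anyone curious what a decoder with these properties looks like, here's a minimal sketch in Python. To be clear, this is not the C# code from the repository: the names `Lz4BlockReader` and `iter_decoded` and the `chunk_size` parameter are made up for illustration, and it assumes a blocking source stream. It reads a single raw LZ4 block a few bytes at a time, keeps only the last 64 KiB of output as history (match offsets are 16 bits, so nothing older can be referenced), and exposes the result as a read-only stream.

```python
import io

WINDOW = 64 * 1024  # match offsets are 16-bit, so 64 KiB of history is enough


def _read_exact(src, n):
    """Read exactly n bytes from src, raising EOFError if the input is truncated."""
    data = bytearray()
    while len(data) < n:
        chunk = src.read(n - len(data))
        if not chunk:
            raise EOFError("truncated LZ4 data")
        data += chunk
    return bytes(data)


def _read_length(src, nibble):
    """A nibble of 15 means: keep adding the following bytes until one is below 255."""
    length = nibble
    if nibble == 15:
        while True:
            b = _read_exact(src, 1)[0]
            length += b
            if b != 255:
                break
    return length


def iter_decoded(src, chunk_size=4096):
    """Yield non-empty pieces of decoded output from one raw LZ4 block."""
    window = bytearray()  # the most recent output, trimmed to WINDOW bytes
    while True:
        token = src.read(1)
        if not token:
            return  # lenient: treat end of input here as end of data
        token = token[0]

        # Literal run: copied straight from the input, in small pieces.
        remaining = _read_length(src, token >> 4)
        while remaining:
            piece = _read_exact(src, min(remaining, chunk_size))
            remaining -= len(piece)
            window += piece
            del window[:-WINDOW]
            yield piece

        # The final sequence has literals only, so the input may end here.
        first = src.read(1)
        if not first:
            return
        offset = first[0] | (_read_exact(src, 1)[0] << 8)
        remaining = _read_length(src, token & 0x0F) + 4  # minimum match is 4
        if offset == 0 or offset > len(window):
            raise ValueError("invalid match offset")

        # Match: copy from history. Copying at most `offset` bytes at a time
        # handles overlapping matches (offset smaller than the match length).
        while remaining:
            n = min(remaining, offset, chunk_size)
            start = len(window) - offset
            piece = bytes(window[start:start + n])
            remaining -= n
            window += piece
            del window[:-WINDOW]
            yield piece


class Lz4BlockReader(io.RawIOBase):
    """Read-only, non-seekable stream over a raw LZ4 block (illustrative only)."""

    def __init__(self, src):
        self._chunks = iter_decoded(src)
        self._pending = b""

    def readable(self):
        return True

    def readinto(self, buf):
        if not self._pending:
            self._pending = next(self._chunks, b"")  # b"" signals end of data
        n = min(len(buf), len(self._pending))
        buf[:n] = self._pending[:n]
        self._pending = self._pending[n:]
        return n


# Hand-built block: literal "a", a 5-byte match at offset 1, then literal "hello".
sample = bytes([0x11]) + b"a" + bytes([0x01, 0x00]) + bytes([0x50]) + b"hello"
print(Lz4BlockReader(io.BytesIO(sample)).read())  # b'aaaaaahello'
```

Like any raw stream, a single `read(n)` call here may return fewer than `n` bytes, so callers that need exact sizes should loop. This sketch happens to reject out-of-range offsets, but it is otherwise lenient about where the input ends.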
A word of caution: it doesn't strictly validate input. Safe (non-unsafe) C# code can't actually overflow a buffer, so this isn't a security issue, but invalid data yields undefined results - you won't always get an exception, and sometimes garbage output will be silently produced.
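To give a rough idea of what stricter validation could involve, here's a small Python sketch that walks a raw LZ4 block held in memory without decompressing it and rejects the most obvious structural problems. The function name `check_raw_lz4_block` is hypothetical, it assumes a standalone block (no external dictionary or prefix), and it deliberately skips the format's finer end-of-block rules, so treat it as an illustration and not a complete validator.

```python
def check_raw_lz4_block(data):
    """Return the decoded size of a raw LZ4 block, or raise ValueError if malformed."""
    pos = 0       # read position in the compressed data
    produced = 0  # how many bytes a decoder would have output so far

    def read_length(nibble):
        # A nibble of 15 is extended by the following bytes until one is below 255.
        nonlocal pos
        length = nibble
        if nibble == 15:
            while True:
                if pos >= len(data):
                    raise ValueError("truncated length field")
                b = data[pos]
                pos += 1
                length += b
                if b != 255:
                    break
        return length

    while True:
        if pos >= len(data):
            raise ValueError("block does not end with a literals-only sequence")
        token = data[pos]
        pos += 1

        literals = read_length(token >> 4)
        if pos + literals > len(data):
            raise ValueError("literal run extends past the end of the input")
        pos += literals
        produced += literals

        if pos == len(data):
            return produced  # the final sequence carries literals only

        if pos + 2 > len(data):
            raise ValueError("truncated match offset")
        offset = data[pos] | (data[pos + 1] << 8)
        pos += 2
        if offset == 0 or offset > produced:
            raise ValueError("match offset points outside the decoded data")
        produced += read_length(token & 0x0F) + 4  # minimum match length is 4


# Hand-built block: literal "a", a 5-byte match at offset 1, then literal "hello".
good = bytes([0x11]) + b"a" + bytes([0x01, 0x00]) + bytes([0x50]) + b"hello"
print(check_raw_lz4_block(good))  # 11
```

A decoder that skips checks like these has to do something with an out-of-range offset, which is where silently wrong output can come from.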
At some point I'll likely be adding a C or C++ implementation as well.
I am, of course, happy to hear any feedback.
Cheers!