Luckily at least, data compression is more objectively measurable.
Things like audio and video compression involve a lot of subjective
tradeoffs, like how the artifacts of one codec look or sound relative
to another, and I have noticed that my perceptions don't really seem
to agree with those of other people.
Eg: I seem to accept a loss of dynamic range and an increase in
"sharpness" a lot more readily than a loss of detail and blurring;
whereas many other people would apparently much rather have a blurry
mess than sharp high-contrast edges.
Similarly for audio: I prefer codecs which decay into a "metallic"
sound over ones which introduce a bunch of warbling and whistling at
lower bitrates.
Eg: MP3 below ~60 kbps sounds a bit unpleasant with all the whistling
and warbling and similar, whereas 11 kHz ADPCM still sounds pretty
reasonable in comparison.
The result of this has partly been a lot of my own codecs using
Color-Cell and VQ style technology, along with ADPCM style audio
codecs, which apparently a lot of other people perceive as looking and
sounding worse than they do to me.
But, yeah, as for data compression, I don't have all that much notable.
I had a few compressors which did pretty well (vs, eg, LZ4 and Zstd)
at being fast on my past computers (K10 and FX), but after switching
to a Ryzen, both LZ4 and Zstd got a pretty significant speed-up.
Getting high compression ratios is, granted, a little harder...
Recently, I managed to pull off something which gets slightly better
compression at comparable decode speeds to LZ4 (on my Ryzen), and is
faster to decode than LZ4 on my own ISA (BJX2).
Decode speed on a 3.7 GHz Ryzen is, in both cases, a little over
2 GB/sec.
I haven't yet gotten around to running tests on ARM.
Compression ratios for many files seem to land part-way between LZ4
and Deflate.
Note that, like LZ4, it is byte oriented with no entropy coding.
BtRP2 (Transposed, LE):
* dddddddd-dlllrrr0 (l=3..10, d=0..511, r=0..7)
* dddddddd-dddddlll-lllrrr01 (l=4..67, d=0..8191)
* dddddddd-dddddddd-dlllllll-llrrr011 (l=4..515, d=0..131071)
* rrrr0111 (Raw Bytes, r=(r+1)*8, 8..128)
* rrr01111 (Long Match)
* rr011111 (r=1..3 bytes, 0=EOB)
* rrrrrrrr-r0111111 (Long Raw, r=(r+1)*8, 8..4096)
** d: Distance
** l: Match Length
** r: Literal Length
Values are encoded in little-endian order, with tag bits located in the
LSB. Bits will be contiguous within the value, with shift-and-mask being
used to extract individual elements.
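
As a rough illustration, the shift-and-mask extraction for the three
match formats could look like the following sketch (the function and
helper names are mine, and it assumes the xxx111 raw/EOB tag cases are
dispatched elsewhere before this is called):

  #include <stdint.h>

  /* Read 16/24/32-bit little-endian values from the input stream. */
  static uint32_t get_u16le(const uint8_t *p)
    { return (uint32_t)p[0] | ((uint32_t)p[1] << 8); }
  static uint32_t get_u24le(const uint8_t *p)
    { return get_u16le(p) | ((uint32_t)p[2] << 16); }
  static uint32_t get_u32le(const uint8_t *p)
    { return get_u24le(p) | ((uint32_t)p[3] << 24); }

  /* Decode one match tag; fills in literal count (r), match length
     (l), and match distance (d), returns the tag size in bytes. */
  static int btrp2_decode_match(const uint8_t *src,
    uint32_t *r, uint32_t *l, uint32_t *d)
  {
    uint32_t v;
    if (!(src[0] & 1)) {          /* dddddddd-dlllrrr0 */
      v = get_u16le(src);
      *r = (v >> 1) & 7;
      *l = ((v >> 4) & 7) + 3;    /* 3..10 */
      *d = (v >> 7) & 511;        /* 0..511 */
      return 2;
    }
    if (!(src[0] & 2)) {          /* dddddddd-dddddlll-lllrrr01 */
      v = get_u24le(src);
      *r = (v >> 2) & 7;
      *l = ((v >> 5) & 63) + 4;   /* 4..67 */
      *d = (v >> 11) & 8191;      /* 0..8191 */
      return 3;
    }
    /* dddddddd-dddddddd-dlllllll-llrrr011 (low bits assumed 011 here;
       a full decoder would check bit 2 to rule out the x111 cases) */
    v = get_u32le(src);
    *r = (v >> 3) & 7;
    *l = ((v >> 6) & 511) + 4;    /* 4..515 */
    *d = (v >> 15) & 131071;      /* 0..131071 */
    return 4;
  }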
Long Match will encode length and distance using variable-length
encodings directly following the initial tag byte.
* lllllll0 (l=4..131)
* dddddddd-ddddddd0 (d=0..32767, 32K)
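
Something like the following could read these (a sketch; names are
mine, and it bails out on the low-bit-set forms, since only the
terminating cases are shown above):

  /* Long Match: the rrr01111 tag byte carries the literal count in
     its top three bits; variable-length length and distance follow. */
  static int btrp2_decode_longmatch(const uint8_t *src,
    uint32_t *r, uint32_t *l, uint32_t *d)
  {
    const uint8_t *cs = src + 1;  /* skip the rrr01111 tag byte */
    *r = src[0] >> 5;             /* rrr: literal count */
    if (cs[0] & 1)
      return -1;                  /* longer length form, not shown */
    *l = (cs[0] >> 1) + 4;        /* lllllll0: 4..131 */
    cs += 1;
    if (cs[0] & 1)
      return -1;                  /* longer distance form, not shown */
    *d = ((uint32_t)cs[0] | ((uint32_t)cs[1] << 8)) >> 1;  /* 0..32767 */
    cs += 2;
    return (int)(cs - src);       /* total bytes consumed */
  }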
The compression could be improved a little more in the Long Match case
via a few more conjoined cases, but the gain in compression would be
pretty small relative to the amount of code added (as the first few
cases handle the vast majority of LZ matches).
I had also left out adding 1 to the distance, mostly because in my tests
this had very little effect on compression, but a somewhat more
noticeable effect on decode speed.
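
Concretely (with hypothetical variable names), the difference is just
how the copy source for a match is computed:

  cs = dst - d;        /* distance used as-is */

versus:

  cs = dst - (d + 1);  /* one extra byte of reach (and no wasted d=0
                          code), but an extra add on the hot path */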