I did consider a chunk size of N=32 bits.
If N=16, the compression gain for all-zero words is 16:1, and the
penalty added to each nonzero word is about 6%. That's not bad for a
simple decoder.
A file size ratio of 0.4 is pretty good. I like simple and done. I can
explain this to a smart intern in minutes.
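
Here is a minimal Python sketch of one scheme that would give those
numbers. It is an assumption, not necessarily what the Basic program
does: each N-bit word gets a one-bit flag, a zero word is sent as the
flag alone, and a nonzero word is sent as the flag plus its N bits. The
function names are made up for illustration.

```python
# Hypothetical zero-word suppression, assuming one flag bit per N-bit word.
# Zero word    -> 1 bit      (N:1 gain, 16:1 for N=16)
# Nonzero word -> 1 + N bits (1/N penalty, 6.25% for N=16)

def compress(words, n=16):
    """Encode a list of n-bit integers as a list of bits (0/1)."""
    bits = []
    for w in words:
        if w == 0:
            bits.append(0)          # flag only: the word is all zeros
        else:
            bits.append(1)          # flag, then the word MSB first
            bits.extend((w >> i) & 1 for i in range(n - 1, -1, -1))
    return bits

def decompress(bits, n=16):
    """Inverse of compress(): rebuild the list of n-bit integers."""
    words = []
    i = 0
    while i < len(bits):
        flag = bits[i]
        i += 1
        if flag == 0:
            words.append(0)
        else:
            w = 0
            for b in bits[i:i + n]:
                w = (w << 1) | b
            words.append(w)
            i += n
    return words

def ratio(words, n=16):
    """Compressed size divided by original size, both in bits."""
    return len(compress(words, n)) / (len(words) * n)
```

With N=16, ratio() comes out to 1/16 for an all-zero input and 17/16
for an input with no zero words. A real file would also need padding to
a byte boundary and some way to tell the decoder the word count, which
this sketch leaves out.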
I'll run my little Basic program on some more complex FPGA bit
streams, as soon as we make some. The current project has a serial
flash chip per FPGA so it doesn't need compression, but a new Pi-based
product line might benefit from compression. I expect the FPGA designs
to be fairly simple or absurdly simple. We will use the T20 chips
because we could get them and bought a bucket full for $10 each.
Clocked slow and not working hard, with no PLLs, the 1.25 volt core
supply draws only milliamps. A tiny linear regulator from +5 will work
fine.
My guys say that compared to the big-guy FPGAs, the Efinix design
suite looks like garage-level engineering. That's a feature, not a
defect. A lot of the suite is in plain-sight Python and there are no
FlexLM horrors to fight.