I described the idea of string-insertion for binary trees in the tornado 0.4 topic.
OK, will carefully read this topic. ;)
Is this related to hash chaining?
Yes, I'm just thinking, how we can translate the real offsets into match indexes. In other words, we may use pure LZ77 with MINMATCH=4 (as BALZ v1.04) and then drop first byte of a match as a literal and encode match length of three, plus, an offset as index. Generally speaking, LZPM does it exactly in such way, but for offset->index translation LZPM keeps byte counts - i.e. cnt[c] - keeps count of c's. Thus, match index = cnt[c]-node.cnt;
:)
Although, you may test the latest BALZ, it's not that worser than LZPM, in most cases it's even better, especially on binary data. So, now I'm thinking about further improving LZ77 scheme especially in offset coding, maybe better buffered offsets technique, etc.