It's not outputting the model, it's outputting the symbol range,
and then the specific symbol within that range. In other words, one
number is encoded as two symbols. That's because the model is so skewed
that you'd run into accuracy problems otherwise.
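
To make the "one number as two symbols" idea concrete, here is a minimal Python sketch. The logarithmic bucket layout and the names split_symbol / join_symbol are assumptions made for illustration. They are not taken from the bwc source, whose actual ranges may well differ.

```python
def split_symbol(value):
    """Split a value in 0..255 into (bucket, offset, offset_bits).

    Hypothetical layout: buckets 0 and 1 hold one value each; bucket k >= 2
    covers the values 2**(k-1) .. 2**k - 1.
    """
    if value < 2:
        return value, 0, 0
    bits = value.bit_length() - 1        # value lies in [2**bits, 2**(bits+1))
    return bits + 1, value - (1 << bits), bits


def join_symbol(bucket, offset):
    """Inverse of split_symbol: rebuild the value from bucket and offset."""
    if bucket < 2:
        return bucket
    return (1 << (bucket - 1)) + offset


# Round trip over every byte value.
for v in range(256):
    bucket, offset, _ = split_symbol(v)
    assert join_symbol(bucket, offset) == v
```

With a layout like this the first model only has to distinguish nine buckets, so the very common small values and the very rare large ones never have to share one heavily skewed frequency table.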
Isn't there a link to some paper describing the algo and model?
) From my understanding, in an adaptive arithmetic coder, the model need
) not be explicitly stated in the stream, correct? Couldn't the
) decompressor keep track of it?
)
) PS: I am asking because I am getting poorer results than with plain
) bzip2, which uses a Huffman coder in the last stage.
That's probably because the model is poorer than the one in bzip2.
I say probably because I can't make head or tail of the bzip2 code,
and I can't really remember the bwc code all that well.
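
On the quoted question about adaptive models: yes, in general the decoder can rebuild the model itself, as long as it applies the same updates in the same order as the encoder. The sketch below is a generic order-0 frequency model written for illustration. The class and function names are made up, and it is not the model used in bwc or bzip2.

```python
import math


class AdaptiveModel:
    """Order-0 adaptive frequency model; every symbol starts with count 1."""

    def __init__(self, nsyms):
        self.counts = [1] * nsyms
        self.total = nsyms

    def prob(self, sym):
        return self.counts[sym] / self.total

    def update(self, sym):
        self.counts[sym] += 1
        self.total += 1


def ideal_bits(symbols, nsyms):
    """Ideal arithmetic-coded size in bits under the adaptive model."""
    model = AdaptiveModel(nsyms)
    bits = 0.0
    for s in symbols:
        bits += -math.log2(model.prob(s))  # cost with the model as it is now
        model.update(s)                    # decoder does the same after decoding s
    return bits


# Encoder-side and decoder-side models stay identical without sending counts.
data = [0, 0, 1, 0, 2, 0, 0]
enc, dec = AdaptiveModel(4), AdaptiveModel(4)
for s in data:
    enc.update(s)
    dec.update(s)
assert enc.counts == dec.counts
```

How well this compresses depends entirely on how well the counts track the data, which is why two coders with the same back end can give quite different results.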
SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
Well, the probability of a '0x01' is many orders of magnitude
greater than the probability of a '0xf0'.
A normal arith coder with 32 bits of precision would run into problems
with such a large difference. I think.
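
A rough sketch of where the trouble comes from, in Python. The numbers are illustrative assumptions: many byte-oriented 32-bit range coders keep the working range at or above 2**24 after renormalisation, and each symbol gets a slice of width (range // total) * freq. None of this is taken from the bwc source.

```python
def symbol_width(coder_range, freq, total):
    """Width of the sub-interval a symbol gets in an integer range coder."""
    return (coder_range // total) * freq


MIN_RANGE = 1 << 24   # assumed worst-case range after renormalisation

# A total count of 2**26 would be needed to give one symbol a probability
# of about 1 in 67 million, but then the rarest symbol gets no interval.
assert symbol_width(MIN_RANGE, 1, 1 << 26) == 0

# Capping the total at 2**16 keeps every symbol encodable, but then no
# symbol can be more than about 65536 times likelier than another.
assert symbol_width(MIN_RANGE, 1, 1 << 16) == 256
```

A zero-width interval means the symbol cannot be coded at all, so the total count has to stay well below the minimum range, and that caps how skewed a single model can be.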
) I couldn't find a link to the paper in the tarball I have... I would
) appreciate it if you could direct me to it.
I was mistaken; the link is in the bzip2 tarball. It's the paper by Peter Fenwick:
ftp://ftp.cs.auckland.ac.nz/pub/staff/peter-f/ACSC96paper.ps
) PS: bzip2 uses an adaptive Huffman coder, starting with all symbols
) equiprobable, and updates the model on the fly in both the encoder and
) decoder.
Isn't there some model-switching code in there?
Anyway, the only real advantage bwc still has over bzip2 is the ability
to use much larger blocksizes. I haven't touched it in years.
Not that easy, because the MTF code is hacked to match the model.
) I am assuming I need to start the modification in mtf_send_block and
) mtf_get_block. But I can't figure out the implications for
) model_symbol and subsequent functions if I make this change.
I think you should be able to rewrite the main loops of those two
functions (and additionally rewrite the init code for the models).
(I hope that) I wrote the modeling code in a generic way, so that
shouldn't be too hard.
Nope, sorry.. Haven't maintained it in years. It was mostly a pet project
and I dropped it around the time bzip2 came along. To me, it was mostly
a test to see how much improvement one could get at larger blocksizes
(over a megabyte) and an exercise in optimisation for the bwt step.
I could take a look if I have some spare time, which happens reasonably
often in my current job.