Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.

Dismiss

Arithmetic Assistance

0 views

Skip to first unread message

Einstein

unread,

May 14, 2008, 10:16:49 PM5/14/08

I am trying to learn more of the Arithmetic method of compression. So
I laid this model to myself....

I have a 2 in 14 chance to have a compressible result, and a 12 in 14
chance of having a non-compressible result. Statistically speaking of
course. Now I figure the cost to determine will be higher than the
compression saved, I am now convinced of the 'no random recursive
compression' plan... but I am trying to create a basis to do some
compression models, but I am having a hard time with arithmetic
compression.

Would I use a similar to huffman method to determine if I have
compression?

When I do that I have two odds, 1 or 0. There are 4 outcomes. I take
the percentages of 1+1 * 3, 1+0 * 3, 01 *2 and 00 *1 and divide all
four results to get the compression (Or size increase) of the simple
huffman. If this is the way this math is calculated then that is easy
enough. But is there alternatives? I feel like I am missing something
here...

Thomas Richter

unread,

May 15, 2008, 2:47:19 AM5/15/08

Einstein schrieb:

> I am trying to learn more of the Arithmetic method of compression. So
> I laid this model to myself....
>
> I have a 2 in 14 chance to have a compressible result, and a 12 in 14
> chance of having a non-compressible result. Statistically speaking of
> course. Now I figure the cost to determine will be higher than the
> compression saved, I am now convinced of the 'no random recursive
> compression' plan... but I am trying to create a basis to do some
> compression models, but I am having a hard time with arithmetic
> compression.

Sounds like you are confused. For compression, you need to look at the
statistics of the symbols. The "chance" to get compression can then be
computed from that - in fact, it's the entropy of the source statistics.

To understand arithmetic coding, there is a good article by Nelson in
the Dr. Dobbs Journal (insert this as search phrase into google, it's
the first hit). In short, you represent your symbols by intervals whose
sizes are proportional to the probability of the symbols.

> Would I use a similar to huffman method to determine if I have
> compression?

You can *determine* whether you have compression by comparing the size
of the output with the size of the input. You can *predict* whether you
can expect compression by comparing the entropy (= average number of
bits per symbol) with the number of bits per symbol you have on the
input. In fact, unless your source statistics is uniform, you can expect
compression for *all* "typical" sequences (typical in the sense that
their statistics is close to the statistics you started with to set up
the encoder). You still don't get compression for *all* sequences, and
*most* sequences (in the sense of how much of the available input space)
will expand. You still get compression on average because the sequences
that expand are untypical, i.e. less likely.

> When I do that I have two odds, 1 or 0. There are 4 outcomes. I take
> the percentages of 1+1 * 3, 1+0 * 3, 01 *2 and 00 *1 and divide all
> four results to get the compression (Or size increase) of the simple
> huffman. If this is the way this math is calculated then that is easy
> enough. But is there alternatives? I feel like I am missing something
> here...

I'm probably missing your question. It seems you start the wrong way
'round. Start with a source alphabet and a random source, and from that
compute whether it is compressible or not.

So long,
Thomas

0 new messages