Account Options

  1. Sign in
The old Google Groups will be going away soon.
Switch to the new Google Groups.
Google Groups Home
« Groups Home
Predicting the size of a Huffman block?
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  11 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Harry Potter  
View profile  
 More options Feb 8, 9:18 am
Newsgroups: comp.compression
From: Harry Potter <rose.josep...@yahoo.com>
Date: Wed, 8 Feb 2012 06:18:13 -0800 (PST)
Local: Wed, Feb 8 2012 9:18 am
Subject: Predicting the size of a Huffman block?
That's pretty much it: Given the number of occuring symbols and the
length of the file or data to be compressed, how do I estimate the
size of the output?  I have to do this quickly on 8-bit, 16-bit and 32-
bit computers.  I can work on predicting the size of the overhead
myself.
-------------------
Joseph Rose, a.k.a. Harry Potter
Working magic in the computer community

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Thomas Richter  
View profile  
 More options Feb 8, 10:55 am
Newsgroups: comp.compression
From: Thomas Richter <t...@math.tu-berlin.de>
Date: Wed, 08 Feb 2012 16:55:59 +0100
Local: Wed, Feb 8 2012 10:55 am
Subject: Re: Predicting the size of a Huffman block?
On 08.02.2012 15:18, Harry Potter wrote:

> That's pretty much it: Given the number of occuring symbols and the
> length of the file or data to be compressed, how do I estimate the
> size of the output?  I have to do this quickly on 8-bit, 16-bit and 32-
> bit computers.  I can work on predicting the size of the overhead
> myself.

What type of information do you have, and which type of answer would you
need? Do you have the Huffman code available? What do you know on the
statistics of the source? Do you need a worst-case or an average-case
answer? Is an estimate sufficient?

The general answer to your question, lacking any further information, is
just "no".

Greetings,
        Thomas


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Earl_Colby_Pottinger  
View profile  
 More options Feb 8, 11:26 am
Newsgroups: comp.compression
From: Earl_Colby_Pottinger <earlcolby.pottin...@sympatico.ca>
Date: Wed, 8 Feb 2012 08:26:19 -0800 (PST)
Local: Wed, Feb 8 2012 11:26 am
Subject: Re: Predicting the size of a Huffman block?
On Feb 8, 10:55 am, Thomas Richter <t...@math.tu-berlin.de> wrote:

> The general answer to your question, lacking any further information, is
> just "no".

Given the number of occuring symbols AND frequency of those symbols
PLUS length of the file.

I believe it is possible to calculate the size of a STATIC HUFFMAN
table and the expected size of the resulting file.

For non-static you probably do almost as much work estimating the file
size as just doing the compression completely.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Harry Potter  
View profile  
 More options Feb 9, 10:41 am
Newsgroups: comp.compression
From: Harry Potter <rose.josep...@yahoo.com>
Date: Thu, 9 Feb 2012 07:41:45 -0800 (PST)
Local: Thurs, Feb 9 2012 10:41 am
Subject: Re: Predicting the size of a Huffman block?
On Feb 8, 10:55 am, Thomas Richter <t...@math.tu-berlin.de> wrote:
> What type of information do you have, and which type of answer would you
> need? Do you have the Huffman code available? What do you know on the
> statistics of the source? Do you need a worst-case or an average-case
> answer? Is an estimate sufficient?

*  General information--I'll do more specific later
*  I'm only starting the code--I don't have it yet
*  The compressor can scan the source file or parts of it for the
statistics of the source befoe compression
*  I need the best possible depending on user settings
*  An estimation will do--and is in fact what I want.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
James Dow Allen  
View profile  
 More options Feb 9, 11:54 am
Newsgroups: comp.compression
From: James Dow Allen <jdallen2...@yahoo.com>
Date: Thu, 9 Feb 2012 08:54:20 -0800 (PST)
Local: Thurs, Feb 9 2012 11:54 am
Subject: Re: Predicting the size of a Huffman block?
On Feb 8, 9:18 pm, Harry Potter <rose.josep...@yahoo.com> wrote:

> Given the number of occuring symbols and the
> length of the file or data to be compressed, how do I estimate the
> size of the output?

Do you have the histogram [H1, H2, ... Hk] of tokens
to be coded?  If so, the output size in bits is
given approximately by
   T (0.03 + log T) - sum Hj log Hj
where T = sum Hj, and log is base 2.
0.03 is an estimate of the inefficiency (due to
discreteness) of Huffman coding.

Hope this helps.
James


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Thomas Richter  
View profile  
 More options Feb 9, 11:56 am
Newsgroups: comp.compression
From: Thomas Richter <t...@math.tu-berlin.de>
Date: Thu, 09 Feb 2012 17:56:01 +0100
Local: Thurs, Feb 9 2012 11:56 am
Subject: Re: Predicting the size of a Huffman block?
On 09.02.2012 16:41, Harry Potter wrote:

> On Feb 8, 10:55 am, Thomas Richter<t...@math.tu-berlin.de>  wrote:
>> What type of information do you have, and which type of answer would you
>> need? Do you have the Huffman code available? What do you know on the
>> statistics of the source? Do you need a worst-case or an average-case
>> answer? Is an estimate sufficient?

> *  General information--I'll do more specific later
> *  I'm only starting the code--I don't have it yet
> *  The compressor can scan the source file or parts of it for the
> statistics of the source befoe compression

Is a two-pass approach acceptable, i.e. determine the relative
frequencies first, design the optimal Huffman code, then encode?

> *  I need the best possible depending on user settings
> *  An estimation will do--and is in fact what I want.

If this is simple zero-order Huffman coder: Set p_i to the relative
frequency of symbol i. ceil(-log_2(p_i)) will give you the number of
bits the Huffman code will assign to the symbol. N \sum_i p_i *
ceil(-log_2(p_i)) will give you the compressed file size in bits, where
N is the number of symbols in the stream. Round up to the next byte,
done. Remember that you somehow also need to signal the Huffman code you
used, i.e. there is an overhead of side-channels not included in the
above computation.

Greetings,
        Thomas


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Harry Potter  
View profile  
 More options Feb 9, 12:09 pm
Newsgroups: comp.compression
From: Harry Potter <rose.josep...@yahoo.com>
Date: Thu, 9 Feb 2012 09:09:36 -0800 (PST)
Local: Thurs, Feb 9 2012 12:09 pm
Subject: Re: Predicting the size of a Huffman block?
On Feb 9, 11:56 am, Thomas Richter <t...@math.tu-berlin.de> wrote:
> Is a two-pass approach acceptable, i.e. determine the relative
> frequencies first, design the optimal Huffman code, then encode?

A two-pass approach is my plan.

> If this is simple zero-order Huffman coder: Set p_i to the relative
> frequency of symbol i. ceil(-log_2(p_i)) will give you the number of
> bits the Huffman code will assign to the symbol. N \sum_i p_i *
> ceil(-log_2(p_i)) will give you the compressed file size in bits, where
> N is the number of symbols in the stream. Round up to the next byte,
> done. Remember that you somehow also need to signal the Huffman code you
> used, i.e. there is an overhead of side-channels not included in the
> above computation.

I don't understand most of the above computations as I am limited to a
high-school math education.  But logarithms are too complex to do on
an 8-bit computer with any semblance of speed.

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Harry Potter  
View profile  
 More options Feb 9, 12:25 pm
Newsgroups: comp.compression
From: Harry Potter <rose.josep...@yahoo.com>
Date: Thu, 9 Feb 2012 09:25:07 -0800 (PST)
Local: Thurs, Feb 9 2012 12:25 pm
Subject: Re: Predicting the size of a Huffman block?
I should've mentioned: the figures I have are the size of the block to
compress, the values that occur and the number of occurences for each
value.  However, I don't want to use the number of occurences in the
calculation as it would take too long.  And as for the overhead, I
don't want to reveal how I'm going to do compression, so I'll have to
figure it out myself.  Sorry about the rudeness--it wasn't intended.  :
(

On Feb 9, 12:09 pm, Harry Potter <rose.josep...@yahoo.com> wrote:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Thomas Richter  
View profile  
 More options Feb 9, 2:45 pm
Newsgroups: comp.compression
From: Thomas Richter <t...@math.tu-berlin.de>
Date: Thu, 09 Feb 2012 20:45:11 +0100
Local: Thurs, Feb 9 2012 2:45 pm
Subject: Re: Predicting the size of a Huffman block?
On 09.02.2012 18:09, Harry Potter wrote:

> I don't understand most of the above computations as I am limited to a
> high-school math education.  But logarithms are too complex to do on
> an 8-bit computer with any semblance of speed.

Point 1) No, taking the log to the base of two is not a complex
operation. It is a matter of shifting and counting the shifts and is
trivial.

Point 2) Sorry, folks, but without mathematics no compression. The
problem can be solved by getting an education, and only by that. If
that's not your intention - deliver Pizza instead. Sorry to sound so
strict, but it is *really* that simple. As always, it is up to you to
fix it.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Harry Potter  
View profile  
 More options Feb 10, 9:03 am
Newsgroups: comp.compression
From: Harry Potter <rose.josep...@yahoo.com>
Date: Fri, 10 Feb 2012 06:03:15 -0800 (PST)
Local: Fri, Feb 10 2012 9:03 am
Subject: Re: Predicting the size of a Huffman block?
On Feb 9, 2:45 pm, Thomas Richter <t...@math.tu-berlin.de> wrote:
> Point 1) No, taking the log to the base of two is not a complex
> operation. It is a matter of shifting and counting the shifts and is
> trivial.

I can do that.

> Point 2) Sorry, folks, but without mathematics no compression. The
> problem can be solved by getting an education, and only by that. If
> that's not your intention - deliver Pizza instead. Sorry to sound so
> strict, but it is *really* that simple. As always, it is up to you to
> fix it.

You're right.  I think I have some ideas, though.

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Jim Leonard  
View profile  
 More options Feb 10, 10:06 am
Newsgroups: comp.compression
From: Jim Leonard <mobyga...@gmail.com>
Date: Fri, 10 Feb 2012 07:06:36 -0800 (PST)
Local: Fri, Feb 10 2012 10:06 am
Subject: Re: Predicting the size of a Huffman block?
On Feb 9, 1:45 pm, Thomas Richter <t...@math.tu-berlin.de> wrote:

> Point 2) Sorry, folks, but without mathematics no compression. The
> problem can be solved by getting an education, and only by that. If
> that's not your intention - deliver Pizza instead. Sorry to sound so
> strict, but it is *really* that simple. As always, it is up to you to
> fix it.

For order-0/statistical and complex methods, absolutely.  But I wrote
my own LZ77 compressor long before I understood what those were.  My
math is also limited, but I've tried to make up for it with excessive
research.

I have entertained the idea of going back to college (ie. night school
or community college) to complete calculus and trigonometry.  I have
been using many methods and formulas in my hobby work for over a
decade without fully understanding them.  The proverbial "light bulb"
has not yet lit up yet.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »