Newsgroups: comp.compression
From: Industrial One <industrial_...@hotmail.com>
Date: Fri, 16 May 2008 11:52:20 -0700 (PDT)
Local: Fri, May 16 2008 2:52 pm
Subject: Compression as a measurement standard
So far we've got standardized codecs and entropy coders that
efficiently compress text, images, audio and video. They all shrink
the target data by an amount that depends on the "complexity" of the
content. I was individually compressing a couple of large text files
to cut a friend some slack, since he's still on dial-up, and it caught
my attention that some text files had a higher compression ratio than
others. I checked both out and couldn't figure out what was more
"redundant" about either of them: they're a similar size uncompressed
and don't have any significant differences that would allow better
compression, such as the same copyright disclaimer repeated on every
page like some documents have. I then realized that the TXT with the
lower compression ratio had many typos, misspellings and sloppy
punctuation. I had MS Word correct everything automatically, and it
then compressed better, like I expected. Obviously, a mistyped word
can show up in several variations, e.g. "that", "thta", "taht", etc.,
so it makes sense that text without typos is more "redundant": it
contains more repetitions of the same words and gives the LZW
dictionary fewer unique phrases to map.
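
A rough way to reproduce the typo effect, as a sketch rather than a
proper experiment: the snippet below builds a synthetic text from a
small made-up vocabulary, injects adjacent-letter swaps into a fraction
of the words, and compares compression ratios. Note the assumptions: it
uses Python's zlib (DEFLATE, i.e. LZ77 plus Huffman coding) as a
stand-in, not LZW, and the names compression_ratio, add_typos and VOCAB
are hypothetical helpers invented for this illustration.

```python
import random
import zlib

def compression_ratio(text: str) -> float:
    """Original size divided by compressed size (higher = more compressible)."""
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw, 9))

def add_typos(text: str, rate: float, seed: int = 0) -> str:
    """Swap two adjacent inner letters in roughly `rate` of the words."""
    rng = random.Random(seed)
    words = text.split(" ")
    for i, w in enumerate(words):
        if len(w) >= 4 and rng.random() < rate:
            j = rng.randrange(1, len(w) - 2)  # keep first and last letter in place
            words[i] = w[:j] + w[j + 1] + w[j] + w[j + 2:]
    return " ".join(words)

# Synthetic "document": 5000 words drawn from a tiny vocabulary (an assumption,
# chosen only so the example is self-contained).
VOCAB = ("that which compression ratio dictionary redundant "
         "repetition phrase encoder document punctuation spelling").split()
rng = random.Random(1)
clean = " ".join(rng.choice(VOCAB) for _ in range(5000))
noisy = add_typos(clean, rate=0.2)

print(f"clean text ratio: {compression_ratio(clean):.2f}")
print(f"typo'd text ratio: {compression_ratio(noisy):.2f}")
```

Both strings have the same uncompressed length, so any difference in
ratio comes purely from the extra word variants the typos introduce.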

Now, I'm wondering if compression was ever adopted as a quantifiable
means of "measuring" data in some way. In other words, does it mean
anything if a book I wrote in .txt had a lower compression ratio than
the book my neighbor wrote? Assuming proper
punctuation/grammar/whatever, does it mean my story uses a larger
vocabulary, repeats itself less often, etc.? Must mean SOMETHING. And
if 2 movies use the same codec (say H.264) with the same settings and
one is 20 MB smaller despite the same length, then presumably it
contains more scenes with minimal motion (e.g. a couple of guys
staring at a Graham Chapman poster). But does anyone involved in
producing that movie use compression to "rate the complexity" of a
scene or of the whole movie? Do ANY businesses actually do this?

I'd guess no, since a lossy algorithm doesn't give an exact measure
(the output size depends on what the encoder chooses to throw away),
and since some content that is technically "complex" would score high
but be garbage. E.g. nobody would watch a movie consisting of random
pixels in every new frame.
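
To make the random-pixels point concrete, here is a minimal sketch
comparing how well random bytes and a completely flat buffer compress.
It is only an analogy: it uses lossless zlib on raw byte buffers rather
than a lossy video codec like H.264, and compressed_fraction,
random_frame and flat_frame are hypothetical names for this example.

```python
import os
import zlib

def compressed_fraction(data: bytes) -> float:
    """Compressed size as a fraction of the original (lower = more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)

n = 100_000
random_frame = os.urandom(n)  # stand-in for a frame of random pixels
flat_frame = bytes(n)         # stand-in for a uniformly black frame

print(f"random: {compressed_fraction(random_frame):.3f}")
print(f"flat:   {compressed_fraction(flat_frame):.3f}")
```

The random buffer barely shrinks at all while the flat one collapses
to almost nothing, so "hard to compress" on its own says nothing about
whether the content is worth anything.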

What about text and audio? What would the compression ratio mean in
those contexts?

