Honest question (random compression)

Tom St Denis

Jan 6, 2010, 10:49:57 AM

Just stopped my daily grind at work, and reading Usenet, long
enough to think about this question:

Why is random data compression the holy grail of data compression
[choose your adjective...]? We don't *use* random data. I listen to
MP3 audio, I look at MPEG videos, I talk over a LP encoded phone,
etc...

Why don't compression nutters brag about a better audio codec or video
codec or bytecode codec? ...

Tom

Thomas Richter

Jan 6, 2010, 11:17:25 AM

Tom St Denis wrote:

> Why is random data compression the holy grail of data compression
> [choose your adjective...]?

It isn't. Only con-artists or trolls believe it to be the holy grail,
misunderstanding either the word "random" or the word "compression".

> We don't *use* random data.

Bingo. Typical data used by humans is not random, but highly redundant.

> I listen to
> MP3 audio, I look at MPEG videos, I talk over a LP encoded phone,
> etc...

That's another type of compression, namely lossy compression. Also
valuable, but with different limits, of course.

The limit of lossless compression is that no scheme can shorten all
finite sequences of size <= N; the reason is the "counting
argument".
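
A minimal sketch of the counting argument, in Python. It covers the simpler case of inputs of exactly n bits, and the function names are made up for illustration. There are 2^n inputs but only 2^n - 1 strictly shorter bit strings, so a lossless (one-to-one) coder cannot shorten every input.

```python
def inputs_of_length(n: int) -> int:
    """Number of distinct bit strings of exactly n bits."""
    return 2 ** n


def shorter_outputs(n: int) -> int:
    """Number of distinct bit strings of length 0 .. n-1 (equals 2**n - 1)."""
    return sum(2 ** k for k in range(n))


if __name__ == "__main__":
    for n in (1, 8, 16):
        # There is always one fewer short output than there are inputs,
        # so at least one input cannot be mapped to a shorter string.
        print(n, inputs_of_length(n), shorter_outputs(n))
```

The gap is always exactly one string, which is enough: by the pigeonhole principle, two inputs would have to share an output, and the decoder could not tell them apart.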

For lossy compression, a similar threshold exists, telling you that it
is impossible to compress data below a certain rate while keeping the
distortion (the inverse of "quality") below a given value. This bound is
the rate-distortion curve, and some hard facts are known about it. For
example, for distortion = mean square error, the Gaussian i.i.d. source
is the hardest to compress among sources of a given variance, and the
r-d curve for this source is known. Note that this is a *different*
"hard case" than in the lossless case, where the uniform i.i.d. source
is hardest to compress (namely, not at all, on average).
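
For illustration, the two "hard cases" can be put into numbers. This is a sketch using the standard textbook formulas, with function names made up here: the Gaussian r-d curve under mean square error is R(D) = max(0, 1/2 log2(variance / D)) bits per sample, and a uniform i.i.d. source over k symbols has entropy log2(k) bits per symbol.

```python
import math


def gaussian_rate(variance: float, distortion: float) -> float:
    """Minimum bits per sample for a Gaussian i.i.d. source at a given MSE."""
    if variance <= 0 or distortion <= 0:
        raise ValueError("variance and distortion must be positive")
    if distortion >= variance:
        # Sending nothing and reconstructing the mean already meets the target.
        return 0.0
    return 0.5 * math.log2(variance / distortion)


def uniform_entropy_bits(alphabet_size: int) -> float:
    """Entropy per symbol of a uniform i.i.d. source (lossless lower bound)."""
    return math.log2(alphabet_size)


if __name__ == "__main__":
    print(gaussian_rate(1.0, 0.25))   # allowed MSE is a quarter of the variance
    print(uniform_entropy_bits(256))  # uniformly random bytes
```

For uniformly random bytes the lossless bound is the full 8 bits per byte, which is the "not at all, on average" above.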

> Why don't compression nutters brag about a better audio codec or video
> codec or bytecode codec? ...

If they did, they wouldn't be nutters, but rather people working on such
codecs, as in the MPEG or JPEG committees. No, we don't have nutters
here, but hard-working colleagues.

So long,
Thomas

Bruce Guenter

Jan 6, 2010, 11:31:53 AM

On 2010-01-06, Tom St Denis <t...@iahu.ca> wrote:
> Why is random data compression the holy grail of data compression
> [choose your adjective...]? We don't *use* random data.
>

> Why don't compression nutters brag about a better audio codec or video
> codec or bytecode codec? ...

Simply put, because being able to compress random data means you can (at
least in theory) compress *anything* better. That implies, for example,
that you could compress better than the best audio codec, since you
could just re-compress the output of said codec.
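
The re-compression idea is easy to try with an ordinary compressor. The sketch below uses Python's zlib as a stand-in (not an audio codec) and seeded pseudo-random bytes as a stand-in for random data; the helper names are made up for illustration.

```python
import random
import zlib


def random_bytes(n: int, seed: int = 0) -> bytes:
    """Seeded pseudo-random bytes, used here as a stand-in for random data."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


def sizes_after_passes(data: bytes, passes: int = 3) -> list:
    """Size of the data before and after each repeated zlib pass."""
    sizes = [len(data)]
    for _ in range(passes):
        data = zlib.compress(data, 9)
        sizes.append(len(data))
    return sizes


if __name__ == "__main__":
    text = b"the quick brown fox " * 500
    print("redundant text:", sizes_after_passes(text))
    print("random bytes:  ", sizes_after_passes(random_bytes(10000)))
```

The redundant text shrinks a great deal on the first pass. The random bytes come out slightly larger, because zlib finds nothing to exploit and still has to add its own framing.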

--
Bruce Guenter <br...@untroubled.org> http://untroubled.org/

Tom St Denis

Jan 6, 2010, 12:28:01 PM

On Jan 6, 11:17 am, Thomas Richter <t...@math.tu-berlin.de> wrote:
> If they did, they wouldn't be nutters, but rather people working on such
> codecs, as in the MPEG or JPEG committees. No, we don't have nutters
> here, but hard-working colleagues.

Thanks for the reply, though my post was a bit rhetorical. I was just
ranting against the trolls by showing that their work product is
useless... :-)

One of those days...

Earl_Colby_Pottinger

Jan 6, 2010, 2:29:20 PM

> Why don't compression nutters brag about a better audio codec or video
> codec or bytecode codec? ...

Because you can post garbage about random data compression whose logic
faults only knowledgeable people can spot. The average Joe can be
fooled about what they are claiming to do.

Claim to produce a better audio/video codec, and anyone who is not
deaf/blind can test their claims and see that they are full of BS.

PS, the tests still apply to RDC, but you must have seen the excuses
they make up not to use 'real' data to test their systems too.

Earl Colby Pottinger

Ernst

Jan 7, 2010, 5:22:37 PM

On Jan 6, 11:29 am, Earl_Colby_Pottinger

I would be one of the laymen who find trying to compress the file
"million digit random data" interesting but who must yield to the
more knowledgeable folk.

I left the term "Holy Grail" on Mark's wonderful blog and I meant it
as the quest for some honor that is imagined.

I read once that if we could compress data that most say is
incompressible, it would in essence be a miracle. So, being a Monty
Python fan, I start to hear coconuts for horses as I try to find some way.

The joke here, on me, is that I confess to getting dreamy-happy when
I hope some encoding idea will turn out to compress random data.

I have been within one bit of one bit of compression per cycle a few
times, so "Seeking that Holy Grail" is the term I coined for myself
and, I see, shared with you.