
"patent"ing a compression algorithm


Lynn Lantz

Dec 18, 1998
I've developed a new compression algorithm and would like to get it
patented. Does anyone have experience with this? Does every detail have to
be accompanied by detailed mathematical proofs? How much of the internals
and "secrets" have to be disclosed to file for a patent?
Thanks for any info.

Walter Roberson

Dec 19, 1998
In article <SKDe2.1334$8R6....@news15.ispnews.com>,
Lynn Lantz <lel...@bignet.net> wrote:
:I've developed a new compression algorithm and would like to get it
:patented. Does anyone have experience with this? Does every detail have to
:be accompanied by detailed mathematical proofs? How much of the internals
:and "secrets" have to be disclosed to file for a patent?

The patent process is fairly expensive -- several thousand dollars
by the time you get through. Is your compression algorithm
such a substantial improvement over others in the same field that you
are realistically going to be able to earn back that money through
license sales of your algorithm ?

Please recall that simply having better compression is not what it
takes to win in the marketplace. Your compression has to be put to
practical uses, like being made a component of (oh let's say) WebTV
interactive control of football game camera viewpoints. It can't just
be better, it has to be -enough- better that people pay attention to it
and are willing to change their products to use it. So you have to think
of things like "How well would my product fit with JBIG or MPEG?", and
"Am I going to be able to market my algorithm to the people who make
PKZip as being better than anything they have and better than their
competition like ARCJ?"


Most people who patent lose money because they don't take into account
the cost of the patent process and the realities of marketing. It
is NOT the case that the -technically- best product wins in the market.

Mark Nelson

Dec 19, 1998
Lynn Lantz wrote in message ...

>I've developed a new compression algorithm and would like to get it
>patented. Does anyone have experience with this? Does every detail have to
>be accompanied by detailed mathematical proofs? How much of the internals
>and "secrets" have to be disclosed to file for a patent?


In the United States you are more or less obligated to disclose every part
of your invention that you expect to protect. This doesn't mean you have to
disclose internal methods of operation, but at the same time you will not
protect those internal methods against copying.

Detailed mathematical proofs are not only unnecessary, but I suspect that a
patent attorney would advise against their inclusion. Details tend to limit
the scope of your patent.

If you are doing this as an individual, you can expect to spend a fairly
large sum of money on your patent application, and not have much of a chance
of recovering it in royalties or sales. For the most part, making money from
patents is a game played by big companies. So unless your invention has huge
commercial potential, don't set your hopes too high.

Probably the biggest problem you face as an individual with a patent is
actually defending it. If Microsoft decides that they want to blatantly copy
your algorithm, what are you going to do? Lawsuits are very expensive, and
it's hard to find backers unless you are a really good salesman.

An alternative place to spend your time would be as a thesis or book topic.
Less money, but less risk!

Lynn Lantz

Dec 21, 1998
Walter, thanks for the response. I do believe that my product is superior
to all other compression algorithms to date. My problem is (as just a
person) I don't know where to go next with it. I'd love to speak with pkzip
or some other company about it but feel like I should get it patented first
for my protection.
It is lossless, works on any file type on any platform, is recursive (ie,
the compressed file can be compressed again, and then again, and so on) so
that ANY size file can be compressed to 16 bytes or less. I know this
sounds like "perpetual motion" but it does work and I can prove it.
I'm just looking for some advice as to where to go next with this.
thanks.

Walter Roberson wrote in message <75f39o$kmd$1...@canopus.cc.umanitoba.ca>...

Aandi Inston

Dec 21, 1998
"Lynn Lantz" <lel...@bignet.net> wrote:

>Walter, thanks for the response. I do believe that my product is superior
>to all other compression algorithms to date. My problem is (as just as a
>person) I don't know where to go next with it. I'd love to speak with pkzip
>or some other company about it but feel like I should get it patented first
>for my protection.
>It is lossless, works on any file type on any platform, is recursive (ie,
>the compressed file can be compressed again, and then again, and so on) so
>that ANY size file can be compressed to 16 bytes or less.

Mathematical proofs may be a problem because it is easy to
mathematically prove that such a compression algorithm does not exist.

Do take a look at the comp.compression FAQ which not only covers the
proof but discusses at least one other invention similar to yours.
----------------------------------------
Aandi Inston qu...@dial.pipex.com
Visit http://www.quite.com/ps/ for information on
PostScript, including EPS in 10 easy stages.

Lynn Lantz

Dec 21, 1998
Aandi, thanks for the kind response and for not dismissing me offhand (like
an email I rec'd). I went to the FAQ as you suggested and will study it
further tomorrow. However, I'll point out a couple of things:
1. It is clearly not a fake/sham/scam algorithm.
2. More than one file can have the same compressed file if there is also a
count of how many iterations it took to get there. If one file takes 10
iterations to get there and one file takes 15 iterations to get there, then
they are not the same. The count of how many iterations only adds 1 or 2
bytes at the very end (not at each iteration).
3. I'm not saying it compresses to 0; there is a bottom limit which I
believe to be around 16 bytes. This still allows for 2^128 files (1 with 38
zeroes) that are possible. If all 6 billion people lived eighty years, they
would have to develop 2^50 (1 with 20 zeroes) files every SECOND of their
lives to come up with that many files. It is foolish to say recursive
doesn't work because it does - likewise foolish to say all files can go to
zero.

I still believe mine works as stated, but as I said, I will continue to
study the FAQ and the documents it points to tomorrow.
Thanks.


Aandi Inston wrote in message <367ed6cf....@news.dial.pipex.com>...

Stephen Anderson

Dec 22, 1998
Lynn

My advice to you would be to go and see a patent attorney. Especially
when it comes to software related inventions.

I will break my advice up into several parts for you.

1 DO YOUR HOMEWORK

Just because you independently developed your software doesn't mean
that somebody else doesn't already have a patent similar to, or
exactly like, your technology, or that there isn't already a prior
publication in the system that kills the patentability of your
invention.

To give yourself a better idea of what the state of the art is in any
particular branch of technology, you should go and visit your local
Patent Office sub-office in your nearest capital city. If you live
in any country other than the United States, you will use the
International Patent Classification (IPC) system. It has every piece
of technology from A to Z classified into categories. You can then go
into the patent office library and look at all the patents that have
ever been filed in your country that have been categorised into the IPC
class you are interested in. In other words, the IPC is a way of
reducing the size of the haystack so that you may find that needle
more easily. Patent Office staff are usually very helpful in looking
for the right IPC classification for you, but they can be unreliable
at actually finding it. Patent Attorneys charge a small
fortune for their time, so the more work you can do yourself, the
cheaper the costs will be. You could also consider subscribing to a
web-based database of patent literature. The one I use is at
www.dialogweb.com. However, these services are best used by skilled
operators who have been trained. I am not sure if you get any free
time to trial the service before the charges kick in.

The US uses a completely different classification system, but it does
essentially the same thing.

2 MATURE YOUR IDEA

Make sure you have a mature invention ready before you take it to a
patent attorney. Once the attorney has completed the specification to
your satisfaction, and it has been filed, the clock is ticking. You
have 12 months to find a partner with deep pockets, who will license
the technology from you and pay the high cost of local and foreign
patent filings and prosecutions. The more mature your concept is, the
better, because the attorney has a more complete picture of the
invention and can draft a more thorough specification accordingly;
it is also easier to sell to a potential licensee. MAKE SURE YOU HAVE
AN APPLICATION FILED BEFORE YOU HAWK IT AROUND TO POTENTIAL LICENSEES.
Any publication, including commercial dealing in your invention before
it has been filed, may destroy the patentability of your invention.

3 VISIT A PATENT ATTORNEY

Before you visit the attorney, make sure they have experience in
filing and litigating software-related patents. Software-related
patents are a specialised field in the industry, so don't just choose
the first firm you find in the telephone directory. When you do your
homework at step 1, take note of which patent attorney firms are
filing the applications. See if there is any trend there. Some firms
are better at some things than others, but all know how to charge like
a wounded bull, so make sure you get a full breakdown of the costs up
front. Some firms will offer you a fixed price to draft the patent
specification; others will charge by the hour. In Australia,
depending on the complexity of the technology, we can get a draft
patent specification for around AU$1,500 to $2,000. However, complex
technology will cost more to draft. You can cut down this cost as
much as possible by providing the attorney with a well-written and
precise description and summary of your invention. Also, if you are
able to prepare any or all of the drawings required, then you will
save money there too.

4. AFTER THE APPLICATION HAS BEEN FILED

As I stated earlier, you have 12 months from the date of filing the
application before you will be required to file corresponding
international applications on your invention. As a general rule of
thumb, if you are entering fewer than about 4 or 5 countries, then it
is cheaper for the attorney to ask his foreign associates in those
countries to file a convention application directly. However, if you
are going into more than about 5 countries, your attorney should
advise you to file a Patent Cooperation Treaty (PCT) application.
Either way, you are up for significant fees at the twelve-month stage
to enter the foreign countries. This is where having a rich licensee
is really useful. If you cannot find a rich licensee in the 12-month
time frame, do not despair; all is not lost.

Remember that one of the things that will destroy the patentability of
your invention is prior publication of it in a trade journal or patent
specification, etc. Commercially dealing in the invention is also
considered a publication of the invention. In most countries, except
the USA, publication of your application automatically occurs 18 months
after the date of first filing your patent specification. You can
usually withdraw your application up to very near the 18-month date
and preserve the confidentiality of your invention. In other words,
you can withdraw your application before it is due to be published.
This preserves the patentability of your invention. Provided the
invention hasn't changed much, you or your attorney can re-file the
application and start the clock ticking all over again for a small
Patent Office official fee, because most of the attorney's costs come
from the drafting of the specification that you already have.

Once publication has occurred, either by the patent office after the
18-month date, or by yourself, either by blabbing it out to all your
friends at a party, or in some commercial dealing in the technology, or
by having your local lifestyle television program come and do a piece
on it for their show, there is no turning back. If you were to
withdraw the application after publication occurs, then those
publications would destroy the patentability of your invention. So be
extremely careful about publication issues. Make sure all your
dealings with potential licensees are done under strict
confidentiality. Make sure you get a signed confidentiality agreement
with them before disclosing your invention.

I hope this has been helpful to you. The patent system can be very
rewarding, but there are many traps for new players, and it is easy to
lose your money. Get a good attorney to help you, and you will have fewer
pitfalls to worry about.

Email me if you have any questions at dec...@bigfoot.com I will be
glad to offer you whatever advice I can.

All the best of the season,

Stephen Anderson
Intellectual Property Advisor
BHP
Melbourne
Australia

Walter Roberson

Dec 22, 1998
In article <lDzf2.1219$D76....@news3.ispnews.com>,
Lynn Lantz <lel...@bignet.net> wrote:
:Walter, thanks for the response. I do believe that my product is superior
:to all other compression algorithms to date.

:It is lossless, works on any file type on any platform, is recursive (ie,
:the compressed file can be compressed again, and then again, and so on) so
:that ANY size file can be compressed to 16 bytes or less. I know this
:sounds like "perpetual motion" but it does work and I can prove it.

As we are talking about files, we'll assume that the result of
any compression is another file -- i.e., the output is on full byte
boundaries.

This being the case, and if one supposes that a file of length N can be
distinguished from a file of length N+1, then:

There is 1 possible resulting file of length 0 bytes (empty file)
There are 256^1 possible resulting files of length 1 byte
There are 256^2 possible resulting files of length 2 bytes
There are 256^3 possible resulting files of length 3 bytes
[and so on up to]
There are 256^16 possible resulting files of length 16 bytes.

Thus, the total number of possible resulting files is

sum( i = 0 to 16, 256^i ) = 341616807575530379006368233343265341697

which is a hair over 2^128.


As "ANY size file can be compressed to 16 bytes or less", it follows that
the ultimate compression result of any file must be one of those
341616807575530379006368233343265341697 files.

Now, I invite you to start compressing all files of exactly 17 bytes long.
Compress 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00. The
result must be one of those 341616807575530379006368233343265341697
files. Compress 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01.
The result must be one of those 341616807575530379006368233343265341697 files.
Compress 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02. Compress
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03, all the way
to 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff and start in on
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 and keep going until
you're working on 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00. Keep
working. Each time, the compression result must be one of those
341616807575530379006368233343265341697 files. Eventually you
will have worked right through to
FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF, the compressed
result of which must be one of those 341616807575530379006368233343265341697
files.

So, how many files would you have just compressed?
256^17 of them, which is 2^136 =

87112285931760246646623899502532662132736

and each of these 2^136 files must have compressed to become one of those

341616807575530379006368233343265341697

potential files.


Hmmm... that's odd. Somehow you've managed to pack

87112285931760246646623899502532662132736 original files into
341616807575530379006368233343265341697 compressed files.


On average, that means that each of the compressed files would
have to stand in for 255 original files. Indeed,

sum( i = 0 to 16, 256^i ) * 255 + 1 = 256 ^ 17 which is easy to understand
if you work it out in binary.


One would wonder exactly how it is you are going to manage to do this
packing of 256^17 original files into a space 255 times smaller
AND STILL BE ABLE TO FIGURE OUT WHICH WAS WHICH.


When you work your way through the list compressing, all the way to
01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 you will have
compressed 341616807575530379006368233343265341697 files [getting
that many different result files.] Now what happens when you go to compress
01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 02? It's the
341616807575530379006368233343265341698'th file; which one of the
341616807575530379006368233343265341697 available slots is it going to
occupy?
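
The arithmetic above is easy to reproduce; here is a minimal Python sketch
(nothing more than the sums already quoted) for anyone who wants to check
the figures:

    # Exact check of the counting argument (Python integers are unbounded).
    slots = sum(256**i for i in range(17))   # files of length 0..16 bytes
    originals = 256**17                      # files of exactly 17 bytes
    print(slots)        # 341616807575530379006368233343265341697
    print(originals)    # 87112285931760246646623899502532662132736
    print(slots * 255 + 1 == originals)      # True
    print(originals // slots)                # 255 -- about 255 originals per compressed slot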

graham...@hotmail.com

Dec 22, 1998
In article <lDzf2.1219$D76....@news3.ispnews.com>,
"Lynn Lantz" <lel...@bignet.net> wrote:
> It is lossless, works on any file type on any platform, is recursive (ie,
> the compressed file can be compressed again, and then again, and so on) so
> that ANY size file can be compressed to 16 bytes or less.

Hmm... You've just disclosed a problem in your algorithm. If *any* file
could be compressed to 16 bytes, then only 2^128 different files could be
represented with your method. Considering that in DOS there are
2^4294967296-1 different possible files, you've got a problem. This is
explained in the FAQ.

- GLYPH

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own

pi...@cs.uu.nl

Dec 22, 1998
>>>>> "Lynn Lantz" <lel...@bignet.net> (LL) writes:

LL> Walter, thanks for the response. I do believe that my product is superior
LL> to all other compression algorithms to date. My problem is (as just as a
LL> person) I don't know where to go next with it. I'd love to speak with pkzip
LL> or some other company about it but feel like I should get it patented first
LL> for my protection.
LL> It is lossless, works on any file type on any platform, is recursive (ie,
LL> the compressed file can be compressed again, and then again, and so on) so
LL> that ANY size file can be compressed to 16 bytes or less. I know this
LL> sounds like "perpetual motion" but it does work and I can prove it.
LL> I'm just looking for some advice as to where to go next with this.

In that case you can just save a lot of money by not filing a patent
application. It just won't work and it has been proven not to work.
--
Piet van Oostrum <pi...@cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP]
Private email: Piet.van...@gironet.nl

Aandi Inston

Dec 22, 1998
"Lynn Lantz" <lel...@bignet.net> wrote:

>3. I'm not saying it compresses to 0, there is a bottom limit which I
>believe to be around 16 bytes. This still allows for 2^128 files (1 with 38
>zeroes) that are possible. If all 6 billion people lived eighty years, they
>would have to develop 2^50 (1 with 20 zeroes) files every SECOND of their
>lives to come up with that many files. It is foolish to say recursive
>doesn't work because it does - likewise foolish to say all files can go to
>zero.

I'm not a mathematician, but two things make me uneasy about the idea.

1. Since any file can be compressed to 16 bytes, that implies that all
data on your hard disk could be compressed to 16 bytes; so could all
the data on the internet; and the sum of all human knowledge,
including everything not yet discovered. I find it hard to get my head
around that, but it would certainly be a winner. I'd only ever need
one floppy disk, ever.

2. More directly, although there are unlikely to be more than 2^128
specific files actually in existence (about 10^38), that implies that
you could represent every file actually in existence, but you would
need to have in your decoder a list of all files (code 1 =
Shakespeare's Hamlet, 1992 Penguin edition; code 2 = Nasa's picture
number 3243 of Mars; code 3 = American Express's customer list; code 4
= my personal diary for 2 Feb 1998; code 5 = New York Times shares
listing, 1 January 2001; code 6 = a ZIP file of my entire hard disk at
3 am this morning, and so on). Clearly you can't have such a list,
because some of that data is not available to you (ignoring the fact
that the list would be very large).

Since you can't have a list of files, that implies that the 2^128
files you can generate are defined in advance by the algorithm. And
that suggests to me that because there are far more than 2^128
*possible* files, there must be some files which you cannot make by
decompressing any possible 16 byte file.


As I understand mathematical proofs, once a proof exists people will
not look at your arguments, proposals or anything else; the only thing
that matters is demonstrating an error in the proof. Once you have
demonstrated that all known proofs (which suggest that this can't be
done) have an error, people will sit up and take notice. Disproving
even one of them would be a good start.

Of course, it isn't really the mathematicians you have to convince,
but the investors. But that could be hard while the mathematicians are
saying loudly that it is impossible, and hence implying that it must
be a scam.

Walter Roberson

Dec 22, 1998
In article <x1Ef2.398$Wg6....@news13.ispnews.com>,
Lynn Lantz <lel...@bignet.net> wrote:
:2. More than one file can have the same compressed file if there is also a
:count of how many iterations it took to get there. If one file takes 10

:iterations to get there and one file takes 15 iterations to get there, then
:they are not the same. The count of how many iterations only adds 1 or 2
:bytes at the very end (not at each iteration).

If you work out the math, it turns out that the count of iterations
will take precisely enough storage that recursive compression doesn't
work.

You have to put the compression count at the beginning, by the way,
with some variable-length storage mechanism -- if you put it at the
end, you cannot tell when the compressed data stops and the iteration
count begins.

Midnight's Fire

Dec 22, 1998
For general information...

I have been working on a similar compression for years... And am employing a
patent attorney. There is (as of yet) no patent on
1. Recursive BINARY compression
2. Recursive BYTE LEVEL compression
3. Recursive WAVEFORM compression

So no danger of duplicating with anything that is close to 75% recursive...

Stephen Anderson wrote in message <367f04ca...@news.bigpond.com>...

Lynn Lantz

Dec 22, 1998
Aandi, I think you are right. It sounds like I need to disprove a proof to
go any further. Having a degree in Math, I know the foolishness of fighting
the certainty of mathematics. All I can say is that, on the flip side, I
can't find any error in my algorithm. I fully understand the counting
number theorem in the FAQ and the fact that you can't put 16 birds in 15
holes. All I can say at this point (until I get to a much better
mathematician than myself) is that the flaw I see is that every bit only has
2 choices and doesn't take into account that that bit could control/leverage
thousands of bits. For example, if you did 10 iterations to compress a
file, it has to go through 10 iterations during decompression. It only
takes 1 bit to say do it a 10th time (don't stop at 9) but when the
decompression is done it may add thousands of bits/bytes to the file. So
that "one" bit may be controlling/leveraging thousands of bits. Does that
make sense? It's the best explanation I can give until I get with a math
genius.
Again, thanks for the reply.

Aandi Inston wrote in message <367f66c6....@news.dial.pipex.com>...

Lynn Lantz

Dec 22, 1998
Walter, on first point see response to Aandi right before this one. On
second point that is accounted for.

Walter Roberson wrote in message <75nsop$hec$1...@canopus.cc.umanitoba.ca>...

Lynn Lantz

Dec 22, 1998
I didn't mean 16 bytes on the nose. I was using that as about the break-off
point. It may end up being 15 bytes, 4 bits, and 4 added filler bits. Or
end up being 15 bytes, 7 bits, and 1 filler. OR 17 bytes, OR 20 bytes, OR
etc. etc. I'm just stating that once you get down that low it loses its
effectiveness.

graham...@hotmail.com wrote in message
<75ne6a$kbn$1...@nnrp1.dejanews.com>...

Lynn Lantz

Dec 22, 1998
Thank you very much for this information and for taking the time to write
and share all this with me. It is greatly appreciated and will be taken to
heart.

Thanks again,
Lynn.

Midnight's Fire

Dec 22, 1998
Actually, Walter...
you can store the counter in each string before it is compressed.
Assuming you have a fixed loss each cycle (and to do this you have to).

Walter Roberson wrote in message <75nsop$hec$1...@canopus.cc.umanitoba.ca>...
>In article <x1Ef2.398$Wg6....@news13.ispnews.com>,
>Lynn Lantz <lel...@bignet.net> wrote:
>:2. More than one file can have the same compressed file if there is also a
>:count of how many iterations it took to get there. If one file takes 10
>:iterations to get there and one file takes 15 iterations to get there, then
>:they are not the same. The count of how many iterations only adds 1 or 2
>:bytes at the very end (not at each iteration).
>

pi...@cs.uu.nl

Dec 23, 1998
>>>>> "Lynn Lantz" <lel...@bignet.net> (LL) writes:

LL> Aandi, I think you are right. It sounds like I need to disprove a proof
LL> to go any further. Having a degree in Math I know the foolishness of
LL> fighting the certainty of mathematics. All I can say is that I on the
LL> flip side I can't find any error in my algorithm. I fully understand
LL> the counting number theorem in the FAQ and the fact that you can't put
LL> 16 birds in 15 holes. All I can say at this point (until I get to a
LL> much better mathematician than myself) is that the flaw I see is that
LL> every bit only has 2 choices and doesn't take into account that that
LL> bit could control/leverage thousands of bits. For example, if you did
LL> 10 iterations to compress a file, it has to go through 10 iterations
LL> during decompression. It only takes 1 bit to say do it a 10th time
LL> (don't stop at 9) but when the decompression is done it may add
LL> thousands of bits/bytes to the file. So that "one" bit may be
LL> controlling/leveraging thousands of bits. Does that make sense? It's
LL> the best explanation I can give until I get with a math genius. Again,
LL> thanks for the reply.

You should just add the number of bits needed to encode the number of
iterations to the number of bits in the compressed output. Plus any bits
needed to tell where the one stops and the other starts. Then apply the
pigeon hole argument to the total number of bits, and see that it can't
work.
If that doesn't convince you, do an experiment and compress any file that
someone gives you to the number of bytes your algorithm is supposed to
achieve, whether that is 16, or 20, or 128 or whatever low number it is
supposed to do. The difficult part is to set up an experiment in such a way
that cheating is impossible.

And in fact this discussion comes up every so many months, so I suspect it
is useless.

Phil Norman

Dec 23, 1998

And, just to cover all the possibilities, make sure it decompresses
too. It's very easy to produce what seems like a wonderfully good
compressor, and to forget to check that it works by using the ultimate
test - the decompressor.

Cheers,
Phil

graham...@hotmail.com

Dec 23, 1998
Actually, Walter is exactly correct on his first point. Any compression
scheme that claims to work on "all data" necessarily tries to model all data.
It's very solid information theory that tells us if you model all data, you
get a model that cannot be used to compress things with.

This directly implies that if you use a recursive scheme, the data telling you
how many times you've recursed will be the same size or larger than the data
you've saved.

Now, about your comment about the counting argument:
"the flaw I see is that every bit only has 2 choices and doesn't take into
account that that bit could control/leverage thousands of bits."

Despite the apparent advantages of using iteration, the Counting Argument is a
fully generalized proof, and covers EVERY POSSIBLE ENCODER, no matter what
scheme you're using. IF YOU CAN SEND MESSAGES WITH IT, THE COUNTING ARGUMENT
APPLIES. Sorry for yelling, but everyone should know this important fact.

On Walter's second point, I'd have to disagree. You could devise a scheme
where the compressed data uses prefix codes in such a way that you know where
they stop without knowing the iteration count, though it might be less
efficient to implement it that way. Or, you could just read the file
backwards and use suffix codes :)

- GLYPH


In article <GSVf2.88$14....@news13.ispnews.com>,


"Lynn Lantz" <lel...@bignet.net> wrote:
> Walter, on first point see response to Aandi right before this one. On
> second point that is accounted for.
>

> Walter Roberson wrote in message <75nsop$hec$1...@canopus.cc.umanitoba.ca>...

> >If you work out the math, it turns out that the count of iterations
> >will take precisely enough storage that recursive compression doesn't
> >work.
> >
> >You have to put the compression count at the beginning, by the way,
> >with some variable-length storage mechanism -- if you put it at the
> >end, you cannot tell when the compressed data stops and the iteration
> >count begins.
>

-----------== Posted via Deja News, The Discussion Network ==----------

Sean Lambert

Dec 23, 1998, to Lynn Lantz
On Tue, 22 Dec 1998, Lynn Lantz wrote:

> Aandi, I think you are right. It sounds like I need to disprove a proof to
> go any further. Having a degree in Math I know the foolishness of fighting
> the certainty of mathematics. All I can say is that I on the flip side I
> can't find any error in my algorithm. I fully understand the counting
> number theorem in the FAQ and the fact that you can't put 16 birds in 15
> holes. All I can say at this point (until I get to a much better

Apparently you don't understand it, so let me try and help.
Since you are a mathematician, let's look at the decompressor
as a function f(x). This function is a magic black box which
takes the compressed file x as its input, and returns the
uncompressed file y as its output. All of the decisions about
repeating the decompression (to account for the recursion) are
done inside this box.

Now look at the number of inputs the box has; let's say it's
files up to X bits long, so that's 2^(X+1) - 1 inputs, if you
count the empty file. Now for there to be more than 2^(X+1) - 1
outputs, some single input would have to decompress to two different
files. This would mean the decompressor decided which file to decompress
to based on data which was NOT IN the file itself.

Here is the problem. Take some file f and run it through your compressor
once. Now it looks exactly like some other file g. Compress this file
further until you are happy. When you decompress it, the decompressor
cannot know whether it was supposed to stop at f or g!!

Prove it to yourself via empirical data:

Here is a simple way to test any compressor to prove the counting
argument does indeed apply to it.

1) Create a file of random bits, using a quality uniform random
bit generator. Save this file.

2) Compress this file, then check the size to make sure some
compression took place. If there was no compression stop.

3) Decompress this file and compare it to the original. If
they are not the same stop.

4) Go back to step 1.

This process can be automated, and you can just turn it on and
leave. It should only take a few iterations before this process
stops, even for the best compressors.
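
If it helps, here is a rough Python sketch of that loop. The command names
"mycompress" and "mydecompress" are only placeholders for whatever
compressor and decompressor are being tested; they are assumed to take an
input path and an output path as arguments.

    import os, secrets, subprocess, tempfile

    def round_trip_test(size=1 << 16):
        iteration = 0
        while True:
            iteration += 1
            data = secrets.token_bytes(size)        # step 1: random input
            with tempfile.TemporaryDirectory() as d:
                orig = os.path.join(d, "orig")
                comp = os.path.join(d, "comp")
                back = os.path.join(d, "back")
                with open(orig, "wb") as f:
                    f.write(data)
                subprocess.run(["mycompress", orig, comp], check=True)
                if os.path.getsize(comp) >= size:   # step 2: did it shrink?
                    print(f"iteration {iteration}: no compression -- stop")
                    return
                subprocess.run(["mydecompress", comp, back], check=True)
                with open(back, "rb") as f:         # step 3: lossless?
                    if f.read() != data:
                        print(f"iteration {iteration}: round trip failed -- stop")
                        return
            # step 4: loop and try again

    round_trip_test()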

> mathematician than myself) is that the flaw I see is that every bit only has


> 2 choices and doesn't take into account that that bit could control/leverage

> thousands of bits. For example, if you did 10 iterations to compress a

It doesn't take that into account because the process is not important.
The counting argument simply describes why for a certain number of
inputs and a 1-to-1 function, there can be only a certain number of
outputs. If your decompressor is not a 1-to-1 function, then the
counting argument does not apply. But then again, who wants a
decompressor that is not 1-to-1?

> file, it has to go through 10 iterations during decompression. It only
> takes 1 bit to say do it a 10th time (don't stop at 9) but when the
> decompression is done it may add thousands of bits/bytes to the file. So
> that "one" bit may be controlling/leveraging thousands of bits. Does that
> make sense? It's the best explanation I can give until I get with a math
> genius.

I don't consider myself a math genius, but I hope I can help you
anyway. But don't let me discourage you. It would be cool to
have a patent on an impossible process! Well, maybe it would only
be cool if you realized it was impossible but were able to convince
the patent office otherwise. In any case, if you could provide me
with a compressor and a decompressor, I would be happy to find the
exception for you, and I have no patent aspirations, so your secret
would be safe with me. =]

--- Sean

Sean Lambert sum...@mindless.com http://sum1els.home.ml.org/


graham...@hotmail.com

Dec 23, 1998
In article <Pine.OSF.3.95.981223...@esus.cs.montana.edu>,
Sean Lambert <sum...@esus.cs.montana.edu> wrote:

[snip]

> I don't consider myself a math genius, but I hope I can help you
> anyway. But don't let me discourage you. It would be cool to
> have a patent on an impossible process! Well, maybe it would only
> be cool if you realized it was impossible but were able to convince
> the patent office otherwise. In any case, if you could provide me
> with a compressor and a decompressor, I would be happy to find the
> exception for you, and I have no patent aspirations, so your secret
> would be safe with me. =]
>
> --- Sean

Actually, the patent office has already granted patents for several impossible
compressors. Sorry I don't have links to references, but someone around here
does I think.

<rant>
It's unfortunate that the experts at the patent office sometimes aren't
specialized enough to verify stuff. In some cases, I'd imagine the top expert
in the field would be the patent applicant himself! But that's why you're
*supposed* to explain things clearly in the patent. Too bad all the legalese
sometimes makes the impossible indistinguishable from the possible.
</rant>

- GLYPH (gl...@mindless.com)

Sultan Hassan

Dec 24, 1998
"Lynn Lantz" <lel...@bignet.net> wrote:


>It is lossless, works on any file type on any platform, is recursive (ie,
>the compressed file can be compressed again, and then again, and so on) so
>that ANY size file can be compressed to 16 bytes or less. I know this
>sounds like "perpetual motion" but it does work and I can prove it.
>I'm just looking for some advice as to where to go next with this.
>thanks.

Ok, say you can compress any file to 24 bytes.
So, you can have 16777216 different files of 24 bytes.
Now, since there exist many more than 16777216 files, how can you differentiate
two or more files which would have the same compressed image?
How can your decoding algorithm tell the difference between them?
This seems pretty impossible to me.

| Hassan Sultan | University of Geneva|
| Computer science student| SWITZERLAND |
| sul...@cui.unige.ch |hsu...@infomaniak.ch|
|-------------------------|---------------------|
| http://www.infomaniak.ch/~hsultan |
|-----------------------------------------------|

Walter Roberson

Dec 24, 1998
In article <3683b253...@news.sunrise.ch>,
Sultan Hassan <sul...@cui.unige.ch> wrote:

:"Lynn Lantz" <lel...@bignet.net> wrote:
:>It is lossless, works on any file type on any platform, is recursive (ie,
:>the compressed file can be compressed again, and then again, and so on) so
:>that ANY size file can be compressed to 16 bytes or less. I know this

:Ok, say you can compress any file to 24 bytes.


:So, you can have 16777216 different files of 24 bytes.
:Now, since there exists much more than 16777216 files, how can you differentiate
:two or more files which would have the same compressed image ?

You gave the number for 24 bits, not for 24 bytes. For 24 bytes, the
number is 6277101735386680763835789423207666416102355444464034512896.

Lynn argued earlier that the number of files that the system
encompasses is far greater than the number of files you could ever
*possibly* produce. 2^(24*8) is, however, only about 10^57.8
which is less than the estimated number of particles in the universe.
If you were to start with the first file and label the first particle
with that file content, label the second particle with the second file's
contents and so on, you would run out of files before you ran out of
particles. You need something on the order of 32 bytes just to
label all particles...
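
The orders of magnitude are quick to check; a small Python sketch, assuming
the usual back-of-the-envelope figure of roughly 10^80 particles in the
observable universe:

    import math

    print(192 * math.log10(2))        # ~57.8, so 2^(24*8) is about 10^57.8
    particles = 10**80                # rough estimate, not an exact count
    print(math.log2(particles) / 8)   # ~33.2 bytes to give every particle a label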

Midnight's Fire

Dec 24, 1998
Actually, I think you are incorrect...
For example:
Suppose for a minute that you could compress (losslessly) 33 bits to 31
bits... Then you have an added bit (total 32) to tell the decomp to stop
or not. Assuming that you can do that 33-to-31 conversion every time,
THERE IS NO LIMIT TO THE POSSIBILITIES that you can decomp to.


Sultan Hassan wrote in message <3682d648...@news.sunrise.ch>...
>robe...@ibd.nrc.ca (Walter Roberson) wrote:
>
>[my Big arithmetic mistake]
>
>>You gave the number for 24 bits, not for 24 bytes. For 24 bytes, the
>>number is 6277101735386680763835789423207666416102355444464034512896.
>
>Ooops !
>You're right, my mistake...
>
>>Lynn argued earlier that the number of files that the system
>>encompasses is far greater than the number of files you could ever
>>*possibly* produce. 2^(24*8) is, however, only about 10^57.8
>>which is less than the estimated number of particles in the universe.
>>If you were to start with the first file and label the first particle
>>with that file content, label the second particle with the second file's
>>contents and so on, you would run out of files before you ran out of
>>particles. You need something on the order of 32 bytes just to
>>label all particles...
>
>The fact is that in theory you can have an infinite number of different
>files, and reducing any file to a fixed size means that there will always
>be two files which will have the same compressed image, and no deterministic
>decoding algorithm can take one compressed image and extract two different
>files from it.
>
>Practically, due to the limitations of filesystems, the number of different
>files is limited to a very big number. However no one can say that two of
>these files will not produce a same say 32 bytes compressed image.

Sultan Hassan

Dec 25, 1998

Tom Lane

Dec 25, 1998
"Midnight's Fire" <midn...@xilch.com> writes:
> suppose for a min that you could compress (losslessly) 33 bits to 31
> bits... Then you have an added bit. (total 32) to tell the decomp to stop
> or not. Assuming that you can do that 33 to 31 conversion everytime.
> THERE IS NO LIMIT TO THE POSSIBILITIES that you can decomp to.

It's a well-known fact in logic that if you start from a false premise,
you can "prove" false conclusions. Since your supposition is false on
its face, whatever you might deduce from it is untrustworthy.

Or, if you prefer: the above is circular reasoning. "If I had a
universal lossless compressor, I could make a universal lossless
compressor."

regards, tom lane

Sultan Hassan

Dec 25, 1998
"Midnight's Fire" <midn...@xilch.com> wrote:

>Actually I think you are incorrect...
>For example.

>suppose for a min that you could compress (losslessly) 33 bits to 31
>bits... Then you have an added bit. (total 32) to tell the decomp to stop
>or not. Assuming that you can do that 33 to 31 conversion everytime.
>THERE IS NO LIMIT TO THE POSSIBILITIES that you can decomp to.

If your assumption is true this could be possible but:
assuming that you can ALWAYS compress 33bits to 31 means that you
could represent all numbers up to 2^33 with 31 bits which is clearly impossible.
31 bits can represent up to 2^31 values and not more.

Antaeus Feldspar

Dec 25, 1998
Midnight's Fire <midn...@xilch.com> wrote:

> For general information...
>
> I have been working on a simular compression for years... And am employing a
> patent attorney. There is (as of yet) no patent on
> 1. Recursive BINARY compression
> 2. Recursive BYTE LEVEL compression
> 3. Recursive WAVEFORM compression
>
> So no danger of duplicating with anything that is close to 75% recursive...

[snip]

For good reason. Recursive compression means that the original
compression didn't exploit the redundancy of the file enough....

-jc

--
* -jc IS *NOW* feld...@cryogen.com
* Home page: http://members.tripod.com/~afeldspar/index.html
* The home of >>Failed Pilots Playhouse<<
* "Better you hold me close than understand..." Thomas Dolby

Midnight's Fire

Dec 26, 1998
You are thinking in mathematical terms...
Compression doesn't have to be based on a mathematical operation.
You can base it on geometry or 'thin' space...
or simply patterns.


Sultan Hassan wrote in message <3684a83...@news.sunrise.ch>...


>"Midnight's Fire" <midn...@xilch.com> wrote:
>

Midnight's Fire

Dec 26, 1998
Again, compression also doesn't have to reduce redundancy...

Antaeus Feldspar wrote in message
<1dkm73k.ko...@bos-ma7-07.ix.netcom.com>...


>Midnight's Fire <midn...@xilch.com> wrote:
>

>> For general information...
>>
>> I have been working on a similar compression for years... And am
>> employing a patent attorney. There is (as of yet) no patent on
>> 1. Recursive BINARY compression
>> 2. Recursive BYTE LEVEL compression
>> 3. Recursive WAVEFORM compression
>>
>> So no danger of duplicating with anything that is close to 75%
>> recursive...

Sultan Hassan

Dec 26, 1998
"Midnight's Fire" <midn...@xilch.com> wrote:

Those mathematical terms prove that it isn't possible to ALWAYS compress
33 bits to 31, which in turn proves that it isn't possible to reduce all
files to a fixed size and still be able to decompress them.
You can't go against the logic of mathematics: if it's not mathematically
possible, it will not be possible with any kind of geometry (which is indeed
mathematics itself), thin space, ...
Algorithms are mathematics and logic.


>you are thinking in mathmatical terms...
>compression doesn't have to be based on a mathmatical operation.
>you can base it on geometry or 'thin'space .. . . ..
>or simply patterns.
>
>
>Sultan Hassan wrote in message <3684a83...@news.sunrise.ch>...

[...]

Antaeus Feldspar

Dec 26, 1998
Midnight's Fire <midn...@xilch.com> wrote:

> Again Compression also soesn't have to reduce redundancy...

No, not everything which qualifies as a compression algorithm
reduces redundancy.
For instance, take an input of length 1 Terabyte, calculated to
have as little exploitable redundancy as possible. A compression
algorithm could then construct a prefix code that expressed this single
input with a single bit. At least one possible input would then be
compressed, with no removal of redundancy.
However, note that there are problems with this form of
compression. A notable problem is that the decompressor would have to
have a literally-stored copy of that input file -- so if you were to try
sending the file as a self-extracting archive, it would *always* be
larger than the original, no matter how compact the code.
It's my understanding (and I could be wrong) that to construct
any compressor that can meet this second standard, it *is* necessary to
reduce redundancy.
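
A toy version of that construction, with byte granularity instead of single
bits and a short placeholder standing in for the terabyte file, might look
like this in Python:

    # One pre-agreed file gets a 1-byte code; every other input grows by a byte.
    SPECIAL = b"known data"            # placeholder for the incompressible input

    def compress(data: bytes) -> bytes:
        if data == SPECIAL:
            return b"\x01"             # the single "compressed" case
        return b"\x00" + data          # everyone else pays a 1-byte penalty

    def decompress(blob: bytes) -> bytes:
        if blob == b"\x01":
            return SPECIAL             # only possible because SPECIAL is stored literally
        return blob[1:]

    assert decompress(compress(SPECIAL)) == SPECIAL
    assert decompress(compress(b"anything else")) == b"anything else"

No redundancy is removed anywhere, yet one input shrinks -- and the
decompressor has to carry a literal copy of SPECIAL, which is exactly why a
self-extracting version can never come out smaller than the original.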

Antaeus Feldspar

Dec 27, 1998
Midnight's Fire <midn...@xilch.com> wrote:

> you are thinking in mathmatical terms...
> compression doesn't have to be based on a mathmatical operation.
> you can base it on geometry or 'thin'space .. . . ..
> or simply patterns.

Or topology. Or state-transition. Or cheese.

Sean Lambert

Dec 28, 1998
On Wed, 23 Dec 1998 GLYPH gl...@mindless.com wrote:

> Actually, the patent office has already granted patents for several impossible
> compressors. Sorry I don't have links to references, but someone around here
> does I think.

Several are mentioned in the FAQ (www.faqs.org) in case anyone is
interested. (Time to read it again GLYPH? ;)

Piet van Oostrum

Dec 28, 1998
>>>>> feld...@cryogen.com (Antaeus Feldspar) (AF) writes:

AF> Midnight's Fire <midn...@xilch.com> wrote:
>> you are thinking in mathmatical terms...
>> compression doesn't have to be based on a mathmatical operation.
>> you can base it on geometry or 'thin'space .. . . ..
>> or simply patterns.

AF> Or topology. Or state-transition. Or cheese.

Sponge is also very good.

Kevin Stone

Dec 31, 1998
In article <3684a83...@news.sunrise.ch>, hsu...@infomaniak.ch
(Sultan Hassan) wrote:

>"Midnight's Fire" <midn...@xilch.com> wrote:
>

>>Actually I think you are incorrect...
>>For example.
>>suppose for a min that you could compress (losslessly) 33 bits to 31
>>bits... Then you have an added bit. (total 32) to tell the decomp to stop
>>or not. Assuming that you can do that 33 to 31 conversion everytime.
>>THERE IS NO LIMIT TO THE POSSIBILITIES that you can decomp to.
>

>If your assumption is true this could be possible but:
>assuming that you can ALWAYS compress 33bits to 31 means that you
>could represent all numbers up to 2^33 with 31 bits which is clearly
>impossible.
>31 bits can represent up to 2^31 values and not more.

No, that's not true. It is possible to represent exactly half of all
numbers from 0 to 2^33 in fewer than 33 bits. For example, look at the
minimum bit length for all values from 0 to 2^8...

For values 0 - 255   8 bits are required.
For values 0 - 127   7 bits are required.
For values 0 - 63    6 bits are required.
For values 0 - 31    5 bits are required.
For values 0 - 15    4 bits are required.
For values 0 - 7     3 bits are required.
For values 0 - 3     2 bits are required.
For values 0 - 1     1 bit is required.

Thus a string of all values from 0 - 255 could be represented in just
1792 bits rather than 2048. But since there is no way to designate the
bit length of each value in the encoding data structure without
eliminating your gains or growing the file, there is *no way to
decompress* the data. Therein lies the impossibility.

-Kevin Stone
Stone Entertainment
www.StoneEntertainment.com
(no email please)

Walter Roberson

Jan 5, 1999
In article <stone-31129...@rc-pm3-3-08.enetis.net>,
Kevin Stone <st...@stoneentertainment.com> wrote:
:No that's not true. It is possible to represent exactly half of all

:numbers from 0 to 2^33 in less than 33 bits. For example look at the
:minimum bit length for all values from 0 to 2^8...

:For values 0 - 255 8bits are required.

[...]
:For values 0 - 1 1 bit is required.

:Thus a string of all values from 000 - 255 could be represented in just
:1792 bits rather than 2048. But since there is no way to designate the
:bit length of each value in the encoding data structure without
:eliminating your gains or growing the file, there is *no way to
:decompress* the data. There in lies the impossibility.

Try it out on a small number of bits:

Decimal Binary

0 <empty>

1 0
2 1

3 00
4 01
5 10
6 11

7 000
8 001
9 010
10 011
11 100
12 101
13 110
14 111

15 0000
16 0001

and so on.


Notice that 0 up to (2^N - 2) can be represented in FEWER than N bits,
for a total of (2^N - 1) values. The 2^N'th value requires going
to the full N bits, so the Pigeon Hole Principle is not violated -- you
cannot uniquely store 2^N values in fewer than N bits.
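
In code, the mapping in that table is simply "write the value plus one in
binary and drop the leading 1"; a minimal Python sketch:

    def encode(v: int) -> str:
        return bin(v + 1)[3:]          # bin() yields '0b1...'; strip '0b1'

    def decode(bits: str) -> int:
        return int("1" + bits, 2) - 1  # restore the implied leading 1

    for v in range(17):
        assert decode(encode(v)) == v
    print(encode(15), encode(16))      # '0000' '0001', matching the table

Note that the length of the bit string itself carries information, which is
exactly the out-of-band knowledge discussed below.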


You might complain that I have cheated by allowing the represented
value to depend upon the number of stored bits. I would, though, point
out that this is no more cheating than is differentiating between the
file 0x68 ('h') and the file 0x6869 ('hi'): either way you are
counting on external information about where the end-of-string is. To
store 'hi' in a real binary file requires -some- mechanism that knows
the file is two bytes long and returns the out-of-band indication 'End
of File' when the next byte is to be read. We can easily postulate a
filestore that counted bits instead of bytes and allowed bit-level
reading.


You might also complain that this scheme cannot be used to pack
multiple numbers together in the same file -- you could not, for
example, represent the string 5, 0, 2 without being at risk of
decompressing as 0, 2, 4 (amongst other possibilities.) You would be
correct, but you would also be irrelevant as the task at hand is to
represent a -single- number, not a series of numbers. A 'number' in
this context is a file, and we certainly do not ask of other
compressors such as LZW or Huffman encoding that they be able to
compress several files into a single file and do so without maintaining
any overhead marking the ends of the individual files.

Rickard Westman

Jan 9, 1999
In article <76rskq$8u6$1...@canopus.cc.umanitoba.ca>, Walter Roberson wrote:
>In article <stone-31129...@rc-pm3-3-08.enetis.net>,
>Kevin Stone <st...@stoneentertainment.com> wrote:
>:No that's not true. It is possible to represent exactly half of all
>:numbers from 0 to 2^33 in less than 33 bits. For example look at the
>:minimum bit length for all values from 0 to 2^8...
>
>:For values 0 - 255 8bits are required.
>[...]
>:For values 0 - 1 1 bit is required.
>
>:Thus a string of all values from 000 - 255 could be represented in just
>:1792 bits rather than 2048. But since there is no way to designate the
>:bit length of each value in the encoding data structure without
>:eliminating your gains or growing the file, there is *no way to
>:decompress* the data. There in lies the impossibility.
>
>Try it out on a small number of bits:
[...]

>Notice that 0 up to (2^N - 2) can be represented in FEWER than N bits,
>for a total of (2^N - 1) values. The 2^N'th value requires going
>to the full N bits, so the Pigeon Hole Principle is not violated -- you
>cannot uniquely store 2^N values in fewer than N bits.
>
>You might complain that I have cheated by allowing the represented
>value to depend upon the number of stored bits. I would, though, point
>out that this is no more cheating than is differentiating between the
>file 0x68 ('h') and the file 0x6869 ('hi'): either way you are
>counting on external information about where the end-of-string is.

Yes, it's the same kind of cheating, or simplification, if you
like. If you really want to split hairs, you should count external
overhead as well, in both cases. Your encoding requires more
external overhead, since you require length information with bit
precision rather than byte precision. Add three bits of precision
to the total storage requirements, and your savings are gone.

Not counting out-of-band information doesn't normally affect
results much, but it complicates things, so it's a sensible
simplification as long as you don't abuse it. But if you want to
leverage the "savings" of a few bits by recursive compression, or
something similar (the original topic of this thread, I believe)
every bit counts. And when you count every bit, you also need to
count those bits arising from extra out-of-band requirements.

Applied to recursive decompression scenarios, a bit-level filestore
could tell you the length of the data in the first iteration. But
who would tell you the length in the second iteration, and later,
unless it was stored within the bit stream itself? "Saving" one
bit of data at the cost of requiring additional bits of data
out-of-band can not be part of a viable compression strategy. The
savings are only an illusion resulting from sloppy bit-counting.

--
Rickard Westman <ri...@ida.liu.se>

"Beware of the panacea peddlers: Just because you
wind up naked doesn't make you an emperor."
- Michael A Padlipsky

Manu

Jan 16, 1999
> Try it out on a small number of bits:
>
> Decimal Binary
>
> 0 <empty>
>
> 1 0
> 2 1
>
> 3 00
> 4 01
> 5 10
> 6 11
>
> 7 000
> 8 001
> 9 010
> 10 011
> 11 100
> 12 101
> 13 110
> 14 111
>
> 15 0000
> 16 0001
>
> and so on.

How would you distinguish between 3, 03, 003, ...? I mean,
000 could be 03 or 7. Otherwise, 00 could be 3, 03, 003...

Anyway, that isn't the real reason for my post. I was watching the local
news last night (channel 2 denver, 9 pm) and saw part of a news item.
According to the newscaster, some 16-year-old high school student has
come up with a coding scheme that will speed up internet access by 10
times or something similar! Supposedly this has generated lots of
interest among companies as well as universities! Unfortunately, I
didn't hear the full report, so I'm curious to know if anyone else heard
this. Thanks

manu



Dimitri Smits

Jan 16, 1999, to ksm...@worldnet.att.net
Manu wrote:
> Anyway, that isn't the real reason for my post. I was watching the local
> news last night (channel 2 denver, 9 pm) and saw part of a news item.
> According to the newscaster, some 16 year old high school student has
> come up with a coding scheme that will speed up internet access by 10
> times or something similar! Supposedly this has generated lots of
> interest among companies as well as universities! Unfortunately, I
> didn't hear the full report, so I'm curious to know if anyone else heard
> this. Thanks

I heard something, yeah ... on a local radio station ... well ... local
:-)

I believe it had to do with using GPS (Global Positioning System)
for Internet traffic. It would see where the jams were, and take
evasive action. It would require a 'whole new' path algorithm that
would replace (or co-exist with, I dunno) the algorithms currently
used by routers.

It was also said (in this radio station's newsflashes) that this
same algo could be applied to other media applications as well.

But nevertheless ... it does not really involve compression,
now does it? :-)

--
Greets,
Dimitri Smits,
Verantwoordelijk Uitgever Winak '98-'99

****** Student at RUCA University ***************************
Winak HomePage: http://dit.is/winak
or http://winak-www.uia.ac.be/winak
Green HomePage: http://studwww.rug.ac.be/~svvooren
e-mail: s94...@zorro.ruca.ua.ac.be
*************************************************************


Walter Roberson

Jan 19, 1999
In article <36A0183D...@worldnet.att.net>,
Manu <ksm...@worldnet.att.net> wrote:
:[Walter Roberson, robe...@Ibd.nrc.ca wrote]:
:> Try it out on a small number of bits:

:> Decimal Binary
:> 0 <empty>
:> 1 0
:> 2 1

:> 15 0000
:> 16 0001

:How would you distinguish between 3, 03, 003, ...? I mean,


:000 could be 03 or 7. Otherwise, 00 could be 3, 03, 003...

000 cannot be 03 or 7, it can only be 7 in this context. You have made
the mistake of trying to concatenate together the encoding values,
which you never need to do in this context.


The task at hand [reprising] was to encode 2^N distinct values in fewer
than N bits. If you consider 3, 03, and 003 to be distinct values, then
you must be working in the context where these strings are part of a
larger set of enumerable strings. You would have an ordinal of enumeration
for each string. Map that ordinal to the binary value that I showed.

For example, your enumeration order might be 0, 1, 00, 2, 01, 000, 3, 02,
001, 0000, 4, 03, 002, 0001, 00000, etc. This is 15 distinct values so far,
and they can be represented by the mapping

string ordinal bin. encoding
0 0 <empty>
1 1 0
00 2 1
2 3 00
01 4 01
000 5 10
3 6 11
02 7 000
001 8 001
0000 9 010
4 10 011
03 11 100
002 12 101
0001 13 110
00000 14 111

so, (2^4 - 1) = 15 distinct values can be represented in fewer than 4 bits.


You cannot, though, extend this scheme in any way such that you could
fit a 16th value in fewer than 4 bits, so it is still true that you
can never compress 2^N values to fewer than N bits.
