I'm posing this question because I'm in the process of improving the DC
archiver (that distance coder implementation). If you think we need
better algorithms, I'd love to see your arguments...
My personal opinion is no, we don't really need better algorithms, and
to demonstrate that I'll divide the question above into three areas:
1. lower memory requirements?
Compression algorithms tend to get only asymptotically better with an
exponential memory increase. This means, conversely, that once there's
plenty of memory available on mainstream systems (so you get near the
asymptote), there is no point in optimizing the algorithm to use less
memory, because that would improve compression only slightly, if at all.
I'd like to point out that this is indeed the case on average
nowadays... If not, it'll be the case in a few years, because memory
size tends to increase exponentially with time.
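To illustrate the asymptote (a toy sketch with Python's zlib; wbits
sets the LZ77 window, i.e. the memory the compressor may use, and the
repeated 10 KB block below is made up so the only redundancy sits at
long range):

    import random
    import zlib

    # A pseudo-random 10 KB block repeated 8 times: the only redundancy
    # is at a distance of 10 KB, so only a window of at least that size
    # can exploit it.
    random.seed(0)
    block = bytes(random.randrange(256) for _ in range(10 * 1024))
    data = block * 8

    for wbits in (9, 12, 14, 15):   # window = 2**wbits bytes
        comp = zlib.compressobj(9, zlib.DEFLATED, wbits)
        size = len(comp.compress(data) + comp.flush())
        print(f"window {2**wbits:6d} bytes -> {size:7d} compressed bytes")

Once the window covers the 10 KB distance, doubling the memory again
gains essentially nothing.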
2. higher speed?
What can we still hope for? A 200% speed increase without sacrificing
compression? That's irrelevant, however, given that mainstream
computers double in speed roughly every 18 months... Also, lossless
compression is already quite fast nowadays.
3. better compression ratios?
Let's assume it is possible to get ratios twice as good as the current
ones. Would such an algorithm (although most likely revolutionary) be of
much use? I mean, I use lossless compression nowadays only for
transferring data over the internet, and even that is not really
imperative anymore since I got a high-speed DSL internet connection
(which is also starting to become mainstream, at least over here in
Germany).
An example for my argument: compressing text is becoming quite obsolete
as I see it, except in somewhat more exotic circumstances like Google's
database (which grows exponentially over time, which makes compression
ratio and speed important again, as they save money on new hardware).
--
Edgar
[...]
>1. lower memory requirements?
>
>Compression algorithms tend to get only asymptotically better with an
>exponential memory increase. This means, conversely, that once there's
>plenty of memory available on mainstream systems (so you get near the
>asymptote), there is no point in optimizing the algorithm to use less
>memory, because that would improve compression only slightly, if at all.
>I'd like to point out that this is indeed the case on average
>nowadays... If not, it'll be the case in a few years, because memory
>size tends to increase exponentially with time.
There will be more and more embedded devices, not only Palm Pilots and
the like that are now becoming as fast as PCs were several years ago,
but also smaller devices that control part of a big system. They will
have relatively little memory, and if they have to transfer data,
compression algorithms with small memory requirements are useful.
>2. higher speed?
>
>What can we still hope for? A 200% speed increase without sacrificing
>compression? That's irrelevant, however, given that mainstream
>computers double in speed roughly every 18 months... Also, lossless
>compression is already quite fast nowadays.
Because speed itself is always relative to whatever hardware you are
using, I guess you really mean "lower CPU requirements": fewer
cycles to compress the same amount of data at the same compression
ratio. If you are running a web server that supports compressed
encoding and serves mostly dynamically created pages, you will be
happy to reduce the CPU load - it means you can do more with the same
machine.
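A quick sketch of that trade-off with Python's zlib (timings are
machine-dependent and the payload below is made up; the point is only
that the levels trade cycles against ratio):

    import time
    import zlib

    # Toy payload standing in for a dynamically generated page.
    page = b"<html><body>" + b"<p>hello compression</p>" * 5000 + b"</body></html>"

    # Same data, same format, different CPU cost per compressed byte.
    for level in (1, 6, 9):
        start = time.perf_counter()
        out = zlib.compress(page, level)
        elapsed = time.perf_counter() - start
        print(f"level {level}: {len(out)} bytes in {elapsed * 1000:.2f} ms")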
If you have to screen large numbers of high-resolution images, you will
be glad if they decompress faster.
>3. better compression ratios?
>
>Let's assume it is possible to get ratios twice as good as the current
>ones. Would such an algorithm (although most likely revolutionary) be of
>much use? I mean, I use lossless compression nowadays only for
>transferring data over the internet, and even that is not really
>imperative anymore since I got a high-speed DSL internet connection
>(which is also starting to become mainstream, at least over here in
>Germany).
Think of ISPs who buy bandwidth to sell it. If they transparently
compress data 10% better, they have 10% less bandwidth to pay for.
Or someone having millions of documents in a database - if you can
compress those documents 10% better with the same system, you'll need
less hardware.
>An example for my argument: compressing text is becoming quite obsolete
>as I see it, except in somewhat more exotic circumstances like Google's
>database (which grows exponentially over time, which makes compression
>ratio and speed important again, as they save money on new hardware).
Google aren't the only people who have medium to large databases.
As you pointed out yourself by dividing requirements for compression
algorithms into different categories, there is no single algorithm for
all needs. Even small improvements in any of these categories will
work for at least some people.
Maybe you see compression too much from an end-user perspective? It
really doesn't matter whether you receive the installer for some
program in 15 or 16 seconds, but there are other applications of
compression.
There are also many specialized compression algorithms that work only
on bilevel images of text, or only on DNA sequences, etc. In these
fields there is probably even more room for improvement than with
general-purpose compressors.
Besides, researchers are curious. They will always try to improve
existing systems, and sometimes knowledge will be gained that can be
used in other fields.
However, I think that there is little need for new archive file
formats. It's very inconvenient to have to install some new program
just because someone thinks ZIP isn't good enough to compress three
MBs worth of PDFs, and two percent can be gained from compressing with
something new.
Regards,
Marco
>
>My personal opinion is no, we don't really need better algorithms, and
>to demonstrate that I'll divide the question above into three areas:
>
>
The question is: do we really need cars when walking is just as good?
Do we really need color vision? Hell, we don't need a lot of things,
so what. If you don't want better compression for a host of reasons,
then don't work on it. For me, the very fact that it's a challenge is
enough, especially for making better-compressed files as a pre-pass to
encryption.
David A. Scott
--
SCOTT19U.ZIP NOW AVAILABLE WORLD WIDE "OLD VERSION"
http://www.jim.com/jamesd/Kong/scott19u.zip old version
My Crypto code http://radiusnet.net/crypto/archive/scott/
My Compression code http://bijective.dogma.net/
**TO EMAIL ME drop the roman "five" **
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be drugged.
As a famous person once said "any cryptographic
system is only as strong as its weakest link"
Yes. To paraphrase wise words, a man's [person's] world is defined by how
far he can travel in a day: from walking to flying, the world got smaller
as he could travel farther in less time. A person's vision is limited by
how far they can see: a naked eye can only see so far, whereas a telescope
can help a person see far away. From yelling across a field to the
telegraph to high-compression data communications, a person's insight is
limited by how much information they can acquire or have access to. I'm
sure you know what I'm going to say next if you are following this
paragraph's pattern.
There are designs on drawing boards that will only be realized when
higher-compression algorithms are found. Food for thought: Biometrics is
growing by leaps and bounds. Pretty soon police will have wireless DNA
readers that need to transmit the entire DNA sequence to a computer
somewhere in real-time to obtain a person's identity... today wireless +
huge amounts of data = nightmare. Try doing that with Pkzip.
Ever seen Gattaca? :)
-G
>>There will be more and more embedded devices, not only Palm Pilots and
>>the like that are now becoming as fast as PCs were several years ago,
>>but also smaller devices that control part of a big system.
We hardly need better lossless algorithms for Palm Pilots and the like.
For example, the Pocket PC now comes standard with 64 MB. Most text on
that device is not compressed but expanded (2 bytes for each character).
Has anybody complained about that? No. Most space is taken by things
which we would like to compress with some loss.
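For instance, in Python (assuming the text is plain ASCII stored as
UTF-16, as on such devices):

    text = "plain ASCII text on a handheld device"
    print(len(text.encode("utf-16-le")))  # 74 bytes: 2 per character
    print(len(text.encode("utf-8")))      # 37 bytes: 1 per character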
For better speed we do not need better lossless compression: since no
info can be lost and we want to read all of it, the transmission time
is already small. And bandwidth will be consumed anyway.
Better for memory? Yes, if we want to keep MORE information than we
can process with our brains, then better compression might be handy,
so we can hold the Britannica in a small device.
No, we do not need better lossless compression.
It will be welcomed, but the gains are limited.
ben brugman
"Edgar Binder" <Edgar...@yahoo.de> wrote in message
news:3D3DB266...@yahoo.de...
I can see up to the moon and beyond; no, I do not need to see any further
than I can see now. A telescope, although a good instrument, is not a tool
I use on a daily basis.
There won't be a lot of progress on lossless compression compared to
progress on transmission speed and memory sizes. So if no better algorithm
for lossless compression is ever discovered, we will hardly notice.
>
> There are designs on drawing boards that will only be realized when
> higher-compression algorithms are found. Food for thought: Biometrics is
> growing by leaps and bounds. Pretty soon police will have wireless DNA
> readers that need to transmit the entire DNA sequence to a computer
> somewhere in real-time to obtain a person's identity...
The complete genome of even one person has not yet been acquired. Yes,
they have most of it. Getting the total DNA profile of a criminal is
not possible yet. And for identification you only need a few parts to
get a good match. The amount of data they can get now from one person,
with limited time and limited money, can be sent in a fraction of a
second. The amount of genome data needed to ID a person is small.
The complete genome has only limited information compared with a
standard reference genome. If standard DNA strings are available at the
receiving site, a person's DNA could be transmitted in probably far
less than 10 megabytes. (The genome is about a gigabyte, but most of it
is the same as for every other human.)
(These numbers come from my memory, and my memory doesn't function very
well, as human memories go.)
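A toy Python sketch of that reference-based idea (the reference string
and the difference positions are made up; a real scheme would of course
also have to handle insertions and deletions):

    # Transmit only where a genome differs from a shared reference
    # that the receiving site already has.
    REFERENCE = "ACGT" * 8   # stand-in for the shared reference genome

    def encode_against_reference(genome, reference=REFERENCE):
        # (position, base) pairs where the genome differs from the reference
        return [(i, b) for i, (a, b) in enumerate(zip(reference, genome))
                if a != b]

    def decode_against_reference(deltas, reference=REFERENCE):
        # Rebuild the genome from the reference plus the differences.
        seq = list(reference)
        for pos, base in deltas:
            seq[pos] = base
        return "".join(seq)

    sample = list(REFERENCE)
    sample[5] = "T"    # two made-up point differences
    sample[20] = "C"
    sample = "".join(sample)

    deltas = encode_against_reference(sample)
    print(deltas)      # only 2 entries instead of 32 bases
    assert decode_against_reference(deltas) == sample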
>today wireless +
> huge amounts of data = nightmare. Try doing that with Pkzip.
All existing data in all of the world could be transmitted in a
relatively short time, if we didn't retransmit, retransmit, retransmit,
etc.
The total amount of data that is typed in all of the world is small
compared to the total amount of data that is transported through the
internet each day.
Most of the data transported each day is pictures (moving and
otherwise) and sound. Lossless compression will not improve that.
And they consume bandwidth because those 'files' are transported over
the net very often.
(Take a million books: most books fit in a megabyte, so a million books
fit in a terabyte; not a lot by today's standards. A million books is a
lot, though.)
ben brugman
I would say that it's all a matter of information. But Shannon told us
this already in 1948. All data contains information and some redundant
description of the information. In data compression we are aiming to
find a way to represent the data without having to describe the
redundancy.
This also means that if we are able to model an unknown source well
(i.e., compress its output data well), we have also gained knowledge
about the source model and its parameters. This gained knowledge about
a source may be the actual information that we are looking for!
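In zeroth-order terms, a minimal Python sketch of that measure (the
byte strings below are made-up stand-ins for a source's output):

    import math
    from collections import Counter

    def entropy_bits_per_symbol(data):
        # Zeroth-order Shannon entropy: H = -sum(p * log2(p))
        counts = Counter(data)
        total = len(data)
        return -sum((c / total) * math.log2(c / total)
                    for c in counts.values())

    # A redundant source needs few bits; a uniform one needs log2(256).
    print(entropy_bits_per_symbol(b"aaaaaaab"))        # ~0.54 bits/symbol
    print(entropy_bits_per_symbol(bytes(range(256))))  # 8.0 bits/symbol

The gap between the entropy and the 8 bits/symbol of the raw encoding
is exactly the redundancy a (memoryless) compressor could remove.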
/Nicklas
===
www.ekstrand.org