
Apr 23, 1986, 7:40:39 PM

# "Don't compress that dwarf -- hand me the pliers!" -- Watersign Theatre

(or, everything you always wanted to know about LZ algorithms but were ...)

The Lempel/Ziv 2-D compression work certainly merits respect for its
mathematical beauty.  This is in addition to some "meta-respect" the good
professors have received from the USENET community via the working LZ code
which has processed this very message.


I suspect the academic point Turkowski raises about difference coding
hinges upon the key words "finite-state".  Coding the "repeat count" Ken
describes requires an arbitrary amount of memory for arbitrary-length input,
and thus is more suitable for a more powerful automaton.  (Naturally this is
handwaving, but the interested reader is referred to the thorough treatment
in Minsky's classic "Computation: Finite and Infinite Machines" (Prentice-Hall,
1967).)  It's rather like the impossibility of multiplication on an FSA
(see Minsky, section 2.6).  But don't let this stop you from implementing
a difference (or run-length) compressor -- the enormous number of states
in a typical computer is adequate to process *bounded-size* input such as
a bitmapped screen.  Mathematical restrictions used for proofs in info.
theory journals are not reflective of what is obtainable with case-based
ad hoc-ery in the real world.  So, sure, difference coding is recommended
for monotonic sequences, but then, its non-universality shows up as failure
on text.
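For the bounded-size case, a difference coder really is just a few lines
of C.  A minimal sketch (the function names are mine, not from any actual
utility) -- differences are taken mod 256, so the transform is exactly
invertible, and smooth data maps to small bytes for a back-end entropy coder:

```c
#include <stddef.h>

/* hypothetical difference coder for byte samples; differences wrap
 * mod 256 so encode followed by decode is the identity */
void delta_encode(unsigned char *buf, size_t n)
{
    unsigned char prev = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char cur = buf[i];
        buf[i] = (unsigned char)(cur - prev);   /* small for smooth data */
        prev = cur;
    }
}

void delta_decode(unsigned char *buf, size_t n)
{
    unsigned char prev = 0;
    for (size_t i = 0; i < n; i++) {
        prev = (unsigned char)(prev + buf[i]);  /* running sum restores */
        buf[i] = prev;
    }
}
```

Note this processes a fixed buffer -- the "arbitrary memory" objection only
bites when the input length is unbounded.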

To address another question posed by Ken [say, maybe it's time to
restart the defunct compress mailing list], the LZ method used by 'compress'
differs slightly from the LZ original, in that explicit data pointers are not
output.  An enlightening comparison of LZ variants appears in an article by
Miller/Wegman (of IBM) in the NATO symposium "Combinatorial Algorithms on
Words" (ed. Apostolico, Springer-Verlag, 1986).  In particular, the "string
extension" variant yields somewhat better (a few percent) compression than
does the august 4.3 BSD utility.  However, the data structures chosen (trie
forests, LRU replacement stacks) seemingly require a factor-of-several
slowdown in execution time.  This would also apply to the newer splay-tree
methods of Tarjan/Bentley exhibited in a recent CACM article.  (Too bad they
shied away from LZ comparisons -- they likely lose to LZ in space as well as
time.)  Believe me, the speedups I did to get compress into its present state
would have much less impact if the combined compress/uncompress CPU time were
more than that of the less efficient pack/unpack duo.
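For the curious, the flavor of the pointer-free scheme can be shown in a
toy C fragment -- a deliberately naive dictionary with linear search, nothing
like the hashed tables in the real 'compress' (all names here are mine).
Both sides grow the same table in lockstep, so only table indices are sent:

```c
#include <string.h>

#define MAXCODES 4096
#define MAXSTR   64

/* toy LZW-style dictionary: codes 0..255 are the single bytes,
 * codes 256 up are strings both encoder and decoder can rebuild */
static int dict_len[MAXCODES];
static unsigned char dict_str[MAXCODES][MAXSTR];
static int ncodes;

static void lzw_init(void)
{
    ncodes = 256;
    for (int i = 0; i < 256; i++) {
        dict_len[i] = 1;
        dict_str[i][0] = (unsigned char)i;
    }
}

static int lzw_find(const unsigned char *s, int len)
{
    for (int i = 0; i < ncodes; i++)            /* linear scan -- slow! */
        if (dict_len[i] == len && memcmp(dict_str[i], s, len) == 0)
            return i;
    return -1;
}

/* encode in[0..n) as table indices in out[]; returns the code count */
int lzw_encode(const unsigned char *in, int n, int *out)
{
    unsigned char cur[MAXSTR];
    int curlen = 0, nout = 0;

    lzw_init();
    for (int i = 0; i < n; i++) {
        cur[curlen] = in[i];
        if (lzw_find(cur, curlen + 1) >= 0) {
            curlen++;                           /* extend current match */
        } else {
            out[nout++] = lzw_find(cur, curlen);    /* emit longest match */
            if (ncodes < MAXCODES && curlen + 1 < MAXSTR) {
                memcpy(dict_str[ncodes], cur, curlen + 1);
                dict_len[ncodes++] = curlen + 1;    /* add match + next char */
            }
            cur[0] = in[i];
            curlen = 1;
        }
    }
    if (curlen > 0)
        out[nout++] = lzw_find(cur, curlen);
    return nout;
}
```

On "ABABAB" this emits 'A', 'B', then code 256 ("AB") twice -- no position
or length fields anywhere, which is exactly the difference from the pointer
form of the original papers.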

Now a bit about LZ2D pragmatics: I programmed a simple Peano-Hilbert
curve image scanner shortly after looking at the paper preprint last year,
thinking that such a filter (in recursive C, though there are other ways (*))
would be perfect for piping into 'compress'.  Alas, after compressing a
512x512 gray-scale image or two (from the "spaceman" series), I lost interest
after discovering file *expansion* (1d gave me about 30% for the image, and
2d only about 20%).  It's conceivable I did something wrong, but it could be
that larger, or less "grainy", pictures are necessary for an LZ2D compressor
to shine in practice.  This would be ironic, given that the 1D LZ algorithm
approaches its asymptotic optimum much sooner than does the Huffman method.
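For anyone wanting to repeat the experiment, the scan order can also be
computed point-by-point rather than by the recursive filter I wrote; here is
one common integer formulation (a sketch, not my actual code) mapping a
distance d along the curve to an (x,y) cell in an n-by-n square, n a power
of two:

```c
/* rotate/flip a quadrant so the four sub-curves join up */
static void rot(int n, int *x, int *y, int rx, int ry)
{
    if (ry == 0) {
        if (rx == 1) {
            *x = n - 1 - *x;
            *y = n - 1 - *y;
        }
        int t = *x;             /* swap x and y */
        *x = *y;
        *y = t;
    }
}

/* distance along the Hilbert curve -> (x,y), for an n x n grid */
void d2xy(int n, int d, int *x, int *y)
{
    int rx, ry, t = d;

    *x = *y = 0;
    for (int s = 1; s < n; s *= 2) {    /* one level per iteration */
        rx = 1 & (t / 2);
        ry = 1 & (t ^ rx);
        rot(s, x, y, rx, ry);
        *x += s * rx;
        *y += s * ry;
        t /= 4;
    }
}
```

Calling d2xy for d = 0, 1, ..., n*n - 1 visits every cell once, each step
moving to a 4-adjacent neighbor -- which is the locality property the whole
scheme is betting on.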

How LZ2D might deal with global structure is not yet clear to me, but
the curious might contact Mike Lee at !hplabs, who worked with Lempel during
his sabbatical from the Technion in Israel.  Since HP is working on
proprietary compression hardware (disk controllers and whatnot), you (or I)
might not find out as much as desired.

-- ames!jaw@ames-aurora

(*) e.g., see "A New Algorithm for Generating Hilbert Curves" in the 1/86
Software Practice and Experience, a one-up of the cleverly subtitled
"Space-Filling Curves, or How to Waste Time with a Plotter", which appeared
in the same forum 15 years earlier.

Apr 24, 1986, 8:52:08 PM

In article <14...@ames.UUCP> j...@ames.UUCP (James A. Woods) writes:

>It's rather like the impossibility of multiplication on an FSA.
>(see Minsky, section 2.6.)


It's results like the above that leave me skeptical of many information
theory results.  I personally know of several automata that can multiply
two 16x16 numbers :-) and I'm sure others do too.  I have found only one
issue of the IEEE Transactions on IT to be useful (the March 1982 issue on
quantization: get your hands on it and xerox(tm) the whole thing).

> Now a bit about LZ2D pragmatics: I programmed a simple Peano-Hilbert curve
>image scanner ... Alas, after compressing a 512x512
>gray-scale image or two (from the "spaceman" series), I lost interest after
>discovering file *expansion* (1d gave me about 30% for the image, and
>2d only about 20%).

Skeptic though I am, I really believed that the Peano-Hilbert scan
could compress better than the "video scan".  I'd like to believe that
James missed something (an event of measure zero), but I've been held
up in my own implementation because of a practical concern:

LZ2D was designed to work only with square images whose dimensions are
powers of two.  Usually, I like to save images with arbitrary rectangular
dimensions, perhaps with an associated alpha mask to get non-rectangular
silhouettes.  What is the best way to extend LZ2D to rectangular images?
Find the next larger square and pad it with zeros?  Find the next smaller
square and break up the remainder into smaller and smaller squares that
can be P-H-scanned?
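If the pad-with-zeros route is taken, the padding arithmetic at least is
trivial -- a hypothetical helper (and the zero fill should cost almost
nothing under LZ, though it does dilute the 2-D locality the scan is meant
to exploit):

```c
/* round n up to the next power of two, e.g. for padding a
 * rectangular image out to the square LZ2D expects */
unsigned next_pow2(unsigned n)
{
    unsigned p = 1;
    while (p < n)
        p <<= 1;
    return p;
}
```

A 300x512 image would then be carried in a 512x512 square, with 212 columns
of zeros along for the ride.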

--
Ken Turkowski @ CIMLINC, Menlo Park, CA
UUCP: {amd,decwrl,hplabs,seismo}!turtlevax!ken
ARPA: turtlevax!k...@DECWRL.DEC.COM
