
Image pre-processing before compression


Walter Dnes (delete the 'z' to get my real address)

May 3, 2004, 11:16:49 PM
The following idea is aimed strictly at lossless image compression. I
start off with the assumption that horizontally adjacent pixels in an
image tend to be similar. This implies that the bit-patterns of adjacent
bytes tend to be similar. This, in turn, implies that XOR'ing adjacent
bytes of the same base colour (e.g. every 3rd byte in a 24-bit RGB raster
image) will result in a disproportionately large number of bytes with
values of zero or near zero. This extra non-randomness should have some
impact on compressors.

So much for the theoretical hand-waving. How about an example? OK, I
downloaded from the site nic.funet.fi/pub/graphics/misc/test-images the
file "lena". According to the INDEX file, this is the green portion of
a 512 x 512 raster image. The size is 262944 bytes. This appears to
consist of 800 bytes of header data plus 512 x 512 = 262144 bytes of
image data. The first of 3 small C programs I wrote is "dist", which
counts up the number of bytes grouped by their value and prints the
distribution. dist.c follows...

#include <stdio.h>

int main(int argc, char *argv[])
{
    unsigned long int accumulator, b1[256];
    double percent;
    int i, pixel;
    FILE *input_file, *output_file;

    if(argc != 3){
        printf(" Correct usage:\n %s input_filename output_filename\n", argv[0]);
        return 1;
    }
    if((input_file = fopen(argv[1], "rb")) == NULL){
        printf("Error opening %s for input\n", argv[1]);
        return 1;
    }
    accumulator = 0;
    for (i=0; i<256; i++){ b1[i] = 0; }
    while ((pixel = fgetc(input_file)) != EOF){
        accumulator++;
        b1[pixel]++;
    }
    fclose(input_file);
    if((output_file = fopen(argv[2], "w")) == NULL){
        printf("Error opening %s for output\n", argv[2]);
        return 1;
    }
    for (i=0; i<256; i++){
        /* percent is a double so it matches the %f conversion below */
        percent = (double)b1[i] / (double)accumulator * 100;
        fprintf(output_file, "Count of char[%d] = %lu = %f %% \n", i, b1[i], percent);
    }
    fclose(output_file);
    return 0;
}

Compiling and executing the command...

./dist lena lena.dist

results in 256 lines of output to lena.dist, of which the first 17 are...

Count of char[0] = 24 = 0.009127 %
Count of char[1] = 5 = 0.001902 %
Count of char[2] = 5 = 0.001902 %
Count of char[3] = 15 = 0.005705 %
Count of char[4] = 69 = 0.026241 %
Count of char[5] = 114 = 0.043355 %
Count of char[6] = 167 = 0.063512 %
Count of char[7] = 264 = 0.100402 %
Count of char[8] = 312 = 0.118656 %
Count of char[9] = 434 = 0.165054 %
Count of char[10] = 540 = 0.205367 %
Count of char[11] = 685 = 0.260512 %
Count of char[12] = 849 = 0.322882 %
Count of char[13] = 915 = 0.347983 %
Count of char[14] = 1077 = 0.409593 %
Count of char[15] = 1353 = 0.514558 %
Count of char[16] = 1483 = 0.563998 %

The maximum count is char[96] = 2328 = 0.885360 %. That looks rather
random. Now for program xor. The file xor.c follows...

#include <stdio.h>

int main(int argc, char *argv[])
{
    int char_previous, char_current, char_out;
    FILE *input_file, *output_file;

    if(argc != 3){
        printf(" Correct usage:\n %s input_filename output_filename\n", argv[0]);
        return 1;
    }
    if((input_file = fopen(argv[1], "rb")) == NULL){
        printf("Error opening %s for input\n", argv[1]);
        return 1;
    }
    if((output_file = fopen(argv[2], "wb")) == NULL){
        printf("Error opening %s for output\n", argv[2]);
        return 1;
    }
    char_previous = 0;
    while ((char_current = fgetc(input_file)) != EOF){
        char_out = char_previous ^ char_current;
        char_previous = char_current;
        fputc(char_out, output_file);
    }
    fclose(input_file);
    fclose(output_file);
    return 0;
}

The program outputs the first byte input verbatim. All subsequent
output consists of the xor of the current input byte and the previous
(*NOT* xor'd) input byte. The command...

xor lena lena.xor

produces the file lena.xor. Before we go any further, how do we get
back from the XOR'd version to the original version? The program unxor
does it. unxor.c follows...

#include <stdio.h>

int main(int argc, char *argv[])
{
    int char_previous, char_current, char_out;
    FILE *input_file, *output_file;

    if(argc != 3){
        printf(" Correct usage:\n %s input_filename output_filename\n", argv[0]);
        return 1;
    }
    if((input_file = fopen(argv[1], "rb")) == NULL){
        printf("Error opening %s for input\n", argv[1]);
        return 1;
    }
    if((output_file = fopen(argv[2], "wb")) == NULL){
        printf("Error opening %s for output\n", argv[2]);
        return 1;
    }
    char_previous = 0;
    while ((char_current = fgetc(input_file)) != EOF){
        char_out = char_previous ^ char_current;
        char_previous = char_out;
        fputc(char_out, output_file);
    }
    fclose(input_file);
    fclose(output_file);
    return 0;
}

The program outputs the first byte input verbatim. All subsequent
output consists of the xor of the current input byte and the previous
output byte. The command...

unxor lena.xor lena.xor.unxor

generates the file lena.xor.unxor, which diff says is identical to lena.
Back to lena.xor. Running dist against lena.xor...

dist lena.xor lena.xor.dist

gives us lena.xor.dist, of which the first 17 lines are...

Count of char[0] = 17180 = 6.533710 %
Count of char[1] = 16793 = 6.386531 %
Count of char[2] = 15084 = 5.736583 %
Count of char[3] = 15007 = 5.707299 %
Count of char[4] = 11643 = 4.427939 %
Count of char[5] = 11740 = 4.464829 %
Count of char[6] = 11518 = 4.380401 %
Count of char[7] = 11552 = 4.393331 %
Count of char[8] = 5702 = 2.168523 %
Count of char[9] = 5727 = 2.178030 %
Count of char[10] = 6236 = 2.371608 %
Count of char[11] = 6278 = 2.387581 %
Count of char[12] = 7012 = 2.666728 %
Count of char[13] = 7077 = 2.691447 %
Count of char[14] = 7126 = 2.710083 %
Count of char[15] = 7241 = 2.753818 %
Count of char[16] = 1342 = 0.510375 %

This looks promising. So, let's compare some compression results to
check the impact. First, let's try the plain lena file, with maximum
compression settings for compress, bzip2, gzip, and zip.

-rw-r--r-- 1 user2 users 262944 May 1 21:59 lena
-rw-r--r-- 1 user2 users 245423 May 1 21:59 lena.Z
-rw-r--r-- 1 user2 users 184509 May 1 21:59 lena.bz2
-rw-r--r-- 1 user2 users 234543 May 1 21:59 lena.gz
-rw-r--r-- 1 user2 users 234660 May 2 17:22 lena.zip

Now let's see how lena.xor compresses...

-rw-r--r-- 1 user2 users 262944 May 2 17:11 lena.xor
-rw-r--r-- 1 user2 users 228149 May 2 17:11 lena.xor.Z
-rw-r--r-- 1 user2 users 199451 May 2 17:11 lena.xor.bz2
-rw-r--r-- 1 user2 users 197699 May 2 17:11 lena.xor.gz
-rw-r--r-- 1 user2 users 197820 May 2 17:28 lena.xor.zip

bzip2 seems rather unhappy, but the others improve to varying degrees.
How do the heavy-duty compressors like PAQ-6 and PPM react? Does this
help at all, or does it merely duplicate something they already do?

To give an idea of my skill level, I don't have a degree, and I'll
admit right now that most of the equations tossed around on this forum
are over my head. Also, C is not my "native language" in programming,
which is why the robot-like style. I figure if it compiles without
warnings under "-ansi -pedantic" in gcc, I'm doing OK.

I wouldn't be surprised if this algorithm has been figured out before.
In the unlikely event that it hasn't, I hereby donate it to the public
domain.

--
Walter Dnes; my email address is *ALMOST* like wzal...@waltdnes.org
Delete the "z" to get my real address. If that gets blocked, follow
the instructions at the end of the 550 message.

Thomas Richter

May 4, 2004, 4:10:29 AM
Hi Walter,

> The following idea is aimed strictly at lossless image compression. I
> start off with the assumption that horizontally adjacent pixels in an
> image tend to be similar. This implies that the bit-pattern of adjacent
> bytes tends to be similar. This, in turn implies that XOR'ing adjacent
> bytes of the same base colour (e.g. every 3rd byte in a 24-bit RGB raster
> image), will result in a disproportionately large number of bytes with
> values of zero or near zero. This extra non-randomness should have some
> impact on compressors.

Using similarities in neighbouring pixels for lossless image compression
is of course a very old idea. XORing the pixel values to get something
more compressible does work to some degree, but it is not the optimal way
to do that. Just consider what a "most compressible" image would look like
under this filter: one where neighbouring pixels always tend to give the
"same" XOR'd value. Is that a "natural" description of a natural image?
Maybe not.

The next idea one can get is just to subtract the pixel values instead
of xor'ing them: This is a reversible operation as well. Images are
very compressible if the pixel differences tend to be all the same or
all very small: Thus "grey ramps" are the territory of this algorithm,
and thus, "it works" to some (better) degree (but it's far from
optimal). What we have here is what I'd call the highpass of the Haar
wavelet.
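
As a concrete sketch of the reversible integer form of that Haar step (the
so-called S-transform), assuming nothing beyond two neighbouring pixel
values a and b; the highpass h is exactly the pixel difference, and the
pair (l, h) can be inverted exactly...

#include <stdio.h>

/* floor division by 2 that also works for negative values
   (plain C integer division truncates toward zero instead) */
static int floor_div2(int x)
{
    return (x >= 0) ? x / 2 : -((-x + 1) / 2);
}

int main(void)
{
    int a = 200, b = 197;          /* two neighbouring pixel values          */

    /* forward S-transform (integer Haar): lowpass + highpass                */
    int h = a - b;                 /* highpass: exactly the pixel difference */
    int l = floor_div2(a + b);     /* lowpass:  rounded-down average         */

    /* inverse: recover the original pixels exactly */
    int a2 = l + floor_div2(h + 1);
    int b2 = a2 - h;

    printf("l=%d h=%d -> a=%d b=%d (original %d %d)\n", l, h, a2, b2, a, b);
    return 0;
}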

The next, better prediction takes pixels surrounding the pixel to be
compressed, makes a prediction of the current pixel from them, and stores
only the prediction error in the compressed file. The prediction schemes can
be pretty complex; let me just say that PNG and JPEG-LS work this way, and if
you like, wavelet compression also works this way (though the predictors
are different). The above "take pixel differences" is already a wavelet
highpass, though the simplest possible one.
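
As a concrete example of such a predictor, here is a sketch of PNG's Paeth
filter as defined in the PNG specification; the encoder stores the residual
x - paeth(a, b, c) modulo 256 rather than the pixel value itself...

#include <stdio.h>
#include <stdlib.h>

/* PNG's Paeth predictor: pick whichever of left (a), above (b) or
   upper-left (c) is closest to the linear estimate p = a + b - c */
static int paeth(int a, int b, int c)
{
    int p  = a + b - c;
    int pa = abs(p - a);
    int pb = abs(p - b);
    int pc = abs(p - c);
    if (pa <= pb && pa <= pc) return a;
    if (pb <= pc)             return b;
    return c;
}

int main(void)
{
    int x = 108;                          /* actual pixel value      */
    int pred = paeth(100, 110, 104);      /* left, above, upper-left */
    int residual = (x - pred) & 0xFF;     /* what would be stored    */

    printf("prediction=%d residual=%d\n", pred, residual);
    return 0;
}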

Thus, you share the basic idea with today's top image compression schemes;
just the method requires some improvement (XOR'ing is not a particularly
good predictor). (-:

So long,
Thomas

Guido Vollbeding

May 4, 2004, 4:42:12 AM
Thomas Richter wrote:
>
> The next better prediction takes surrounding pixels from the pixel to be
> compressed and makes a prediction about the current pixel, then storing only
> the prediction error in the compressed file. The prediction schemes can be
> pretty complex; just let me say that PNG and JPEG-LS work this way, and if
> you like, wavelet compression also works this way (though the predictors
> are different). The above "take pixel differences" is already a wavelet
> highpass, though the simplest possible.

Well, my impression is that all those given prediction schemes are rather
naive and suboptimal. You can see this from the fact that multiple prediction
choices are possible, and selection of the "optimal" predictor is outside
the scope of the specification (that is the case at least for PNG and the
lossless JPEG standard). So there seems to be no *real* solution.

> The next idea one can get is just to subtract the pixel values instead
> of xor'ing them: This is a reversible operation as well. Images are
> very compressible if the pixel differences tend to be all the same or
> all very small: Thus "grey ramps" are the territory of this algorithm,
> and thus, "it works" to some (better) degree (but it's far from
> optimal). What we have here is what I'd call the highpass of the Haar
> wavelet.

Well, the major reason why it is suboptimal is that it is only 1-D.
For images, which are 2-D, you would expect an optimal scheme to take
the 2-D image structure into account, that is, to use not only pixels
from the same line for prediction, but also adjacent pixels from the
adjacent line(s).

Having said all that, I think that the "optimal" scheme (prediction)
for lossless image compression is still to be found.
I have a particular idea in mind which I still have to investigate...

Regards
Guido

Matt Mahoney

May 4, 2004, 9:57:37 AM
"Walter Dnes (delete the 'z' to get my real address)"
<wzal...@waltdnes.org> wrote in message
news:c771uv$evcf$1...@ID-146822.news.uni-berlin.de...

> The following idea is aimed strictly at lossless image compression. I
> start off with the assumption that horizontally adjacent pixels in an
> image tend to be similar. This implies that the bit-pattern of adjacent
> bytes tends to be similar. This, in turn implies that XOR'ing adjacent
> bytes of the same base colour (e.g. every 3rd byte in a 24-bit RGB raster
> image), will result in a disproportionately large number of bytes with
> values of zero or near zero. This extra non-randomness should have some
> impact on compressors.

Instead of XOR, try subtraction modulo 256. Then try a second pass where
you subtract the pixel above instead of the pixel to the left (this would be
256 bytes back in the file). The result will be an image where most of the
pixels are near 0. This should be highly compressible. You might then try
adding 128 to the pixel values so the peak of the distribution is centered
in the range 0-255. All of these operations are reversible.
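
One possible reading of that recipe, as a rough sketch rather than a
definitive implementation (the width argument has to match the real
scanline length, which for the lena file above is 512 bytes)...

#include <stdio.h>
#include <string.h>

/* Sketch of the suggestion above: subtract the left neighbour mod 256, then
   subtract the byte one scanline above mod 256, then add 128 so the peak of
   the residual distribution sits mid-range.  Every step is mod 256 and
   therefore reversible.  'buf' holds raw image bytes (no header), 'width'
   is the scanline length in bytes. */
static void delta_filter(unsigned char *buf, long n, long width)
{
    long i;
    for (i = n - 1; i >= 1; i--)          /* horizontal pass, in place */
        buf[i] = (unsigned char)(buf[i] - buf[i - 1]);
    for (i = n - 1; i >= width; i--)      /* vertical pass             */
        buf[i] = (unsigned char)(buf[i] - buf[i - width]);
    for (i = 0; i < n; i++)               /* centre the distribution   */
        buf[i] = (unsigned char)(buf[i] + 128);
}

/* exact inverse: undo the three steps in reverse order */
static void delta_unfilter(unsigned char *buf, long n, long width)
{
    long i;
    for (i = 0; i < n; i++)
        buf[i] = (unsigned char)(buf[i] - 128);
    for (i = width; i < n; i++)
        buf[i] = (unsigned char)(buf[i] + buf[i - width]);
    for (i = 1; i < n; i++)
        buf[i] = (unsigned char)(buf[i] + buf[i - 1]);
}

int main(void)
{
    unsigned char img[16] = { 10, 12, 13, 13,     /* toy 4x4 "image" */
                              11, 12, 14, 15,
                              12, 13, 15, 16,
                              12, 14, 16, 18 };
    unsigned char copy[16];

    memcpy(copy, img, sizeof img);
    delta_filter(img, 16, 4);
    delta_unfilter(img, 16, 4);
    printf("round trip %s\n", memcmp(img, copy, sizeof img) ? "FAILED" : "ok");
    return 0;
}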

I admit that I haven't tried this with PAQ and it might improve the
compression. PAQ doesn't explicitly model .bmp files in 2 dimensions (yet)
but it does have a model that attempts to find repeating characters at fixed
intervals, so it sometimes discovers the image width and models the byte
above it as context. Some of the other top compressors probably have
prefilters that get the image width from the file header and then perform
image differencing like I described, although I can't say because they
aren't open source.

-- Matt Mahoney


Nils

May 4, 2004, 6:52:12 PM
Prediction in JPEG-LS is quite an interesting technique.

It uses these pixels:

C B D
A X

where X is the current pixel and A, B, C, D are pixels in the neighbourhood
(to the left and above) that have already been encoded, so they can be used.
The prediction is 2-dimensional, trying to predict the value of X.

The prediction is first made from A, B, C, D alone. Then the algorithm looks
for a "history" of predictions for that particular combination of A, B, C, D,
called the context. If there is a history, it tries to predict (and correct)
the bias of the prediction.
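
For reference, the fixed predictor that sits underneath that context stage
is the LOCO-I/JPEG-LS "median edge detector"; a minimal sketch, using the
same pixel names as the diagram above...

#include <stdio.h>

/* The fixed predictor of JPEG-LS (the "median edge detector" from LOCO-I),
   with a = left (A), b = above (B), c = above-left (C) as in the diagram.
   The context/bias-correction stage described above sits on top of this. */
static int med_predict(int a, int b, int c)
{
    int mn = (a < b) ? a : b;
    int mx = (a > b) ? a : b;

    if (c >= mx) return mn;   /* an edge is suspected: take the other neighbour */
    if (c <= mn) return mx;
    return a + b - c;         /* smooth region: planar estimate                 */
}

int main(void)
{
    printf("%d\n", med_predict(100, 100, 100));   /* flat area   -> 100 */
    printf("%d\n", med_predict(100, 200, 100));   /* edge        -> 200 */
    printf("%d\n", med_predict(100, 110, 105));   /* gentle ramp -> 105 */
    return 0;
}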

I tested the above and adapted it for my own format "ISA", and although it
gave me gray hairs to implement (with numerous slightly wrong implementations
that did not work along the way), I finally got it working, and this step
gives me a very decent extra compression compared to prediction without
context (about 10%).

It is still not state of the art, but performs remarkably well for
8bit/channel photographic images, even with some noise present.

By the way, even the standard prediction in JPEG-LS (so without the context)
is already better than the Paeth predictor in PNG, according to my findings.

More detailed information can be found in this PDF document:
http://www.hpl.hp.com/loco/HPL-98-193R1.pdf

Nils
www.simdesign.nl


"Guido Vollbeding" <gu...@jpegclub.org> wrote in message
news:40975764...@jpegclub.org...

Walter Dnes (delete the 'z' to get my real address)

May 5, 2004, 6:23:43 AM
On Tue, 04 May 2004 13:57:37 GMT, Matt Mahoney, <matma...@yahoo.com> wrote:

> Instead of XOR, try subtraction modulo 256. Then try a second pass where
> you subtract the pixel above instead of the pixel to the left (this would be
> 256 bytes back in the file). The result will be an image where most of the
> pixels are near 0. This should be highly compressible. You might then try
> adding 128 to the pixel values so the peak of the distribution is centered
> in the range 0-255. All of these operations are reversible.
>
> I admit that I haven't tried this with PAQ and it might improve the
> compression. PAQ doesn't explicitly model .bmp files in 2 dimensions (yet)
> but it does have a model that attempts to find repeating characters at fixed
> intervals, so it sometimes discovers the image width and models the byte
> above it as context.

I downloaded paq6v2.cpp and built it with...

g++ -O3 -march=i686 -funroll-loops -fomit-frame-pointer paq6v2.cpp -o paq6v2

Anyhow, xor'ing actually hurts paq6. The results for default mode
were as follows...

-rw-r--r-- 1 user2 172388 May 5 05:28 lena.pq6
-rw-r--r-- 1 user2 181344 May 5 05:28 lena.xor.pq6

If and when I manage to improve things, I'll get back to this group.

rep_movsd

May 6, 2004, 3:55:21 PM
"Matt Mahoney" <matma...@yahoo.com> wrote in message news:<ljNlc.4650$8S1...@newsread2.news.atl.earthlink.net>...

> Instead of XOR, try subtraction modulo 256. Then try a second pass where
> you subtract the pixel above instead of the pixel to the left (this would be
> 256 bytes back in the file). The result will be an image where most of the
> pixels are near 0. This should be highly compressible. You might then try
> adding 128 to the pixel values so the peak of the distribution is centered
> in the range 0-255. All of these operations are reversible.
>
> I admit that I haven't tried this with PAQ and it might improve the
> compression. PAQ doesn't explicitly model .bmp files in 2 dimensions (yet)
> but it does have a model that attempts to find repeating characters at fixed
> intervals, so it sometimes discovers the image width and models the byte
> above it as context. Some of the other top compressors probably have
> prefilters that get the image width from the file header and then perform
> image differencing like I described, although I can't say because they
> aren't open source.
>
> -- Matt Mahoney

I have been experimenting some, and one approach I found was to
rearrange the pixels of an image by traversing the image along the
path of a Hilbert space-filling curve, which tends to bring
spatially close pixels together, and then applying a delta filter. It
looks really good on paper; however, it fails in practice :(
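
For anyone who wants to reproduce the reordering, here is a sketch of the
standard index-to-coordinate mapping for a Hilbert curve on an n x n grid
(a generic routine, not the code actually used for the results below)...

#include <stdio.h>

/* Map position d along a Hilbert curve covering an n x n grid (n a power of
   two) to grid coordinates (x, y).  Visiting d = 0 .. n*n-1 walks the whole
   grid so that indices that are close stay close in 2-D. */
static void hilbert_d2xy(int n, int d, int *x, int *y)
{
    int rx, ry, s, t = d, tmp;
    *x = 0;
    *y = 0;
    for (s = 1; s < n; s *= 2) {
        rx = 1 & (t / 2);
        ry = 1 & (t ^ rx);
        if (ry == 0) {                 /* rotate the quadrant if needed */
            if (rx == 1) {
                *x = s - 1 - *x;
                *y = s - 1 - *y;
            }
            tmp = *x; *x = *y; *y = tmp;
        }
        *x += s * rx;
        *y += s * ry;
        t /= 4;
    }
}

int main(void)
{
    int d, x, y;
    for (d = 0; d < 16; d++) {         /* order of visits on a 4x4 grid */
        hilbert_d2xy(4, d, &x, &y);
        printf("d=%2d -> (%d,%d)\n", d, x, y);
    }
    return 0;
}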

The images I used (256 x 256) were screen shots and photographs.
Even visually, the reordered images look more even (read: compressible),
but compressing the images using PPMD gives results several percent
worse than the original (both with and without the delta coding).

Maybe it would work better for small images, or with PAQ?

Walter Dnes (delete the 'z' to get my real address)

May 8, 2004, 3:40:21 PM
On Tue, 04 May 2004 13:57:37 GMT, Matt Mahoney, <matma...@yahoo.com> wrote:

> Instead of XOR, try subtraction modulo 256. Then try a second pass where
> you subtract the pixel above insted of the pixel to the left (this would be
> 256 bytes back in the file) The result will be an image where most of the
> pixels are near 0. This should be highly compressible. You might then try
> adding 128 to the pixel values so the peak of the distribution is centered
> in the range 0-255. All of these operations are reversible.
>
> I admit that I haven't tried this with PAQ and it might improve the
> compression. PAQ doesn't explicitly model .bmp files in 2 dimensions (yet)
> but it does have a model that attempts to find repeating characters at fixed
> intervals, so it sometimes discovers the image width and models the byte
> above it as context. Some of the other top compressors probably have
> prefilters that get the image width from the file header and then perform
> image differencing like I described, although I can't say because they
> aren't open source.

I've done some experimenting with delta-filtering the past few days,
but no luck. I noticed one immediate area for a slight improvement. If
you have a 512x512 image (lena), delta against the previous pixel will
not optimize the leftmost pixel. With 512 scanlines, that's 512 pixels.
Similarly, delta against the pixel immediately above the current pixel
will miss optimizing the entire top scanline. I tried optimizing the
delta process like so (a rough sketch in C follows the list)...
- read in the first byte verbatim
- delta the next 799 bytes against the previous byte
  (we've now taken care of the 800-byte header)
- repeat the next two steps 512 times, once per scan-line:
  - delta the leftmost pixel of the scan-line against the pixel
    immediately above it
  - delta the remaining 511 pixels in the scan-line against the
    pixel immediately to the left of it
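
A rough sketch in C of one reading of that list (not the exact program
behind the lena.delta_2 numbers below; it assumes the 800-byte header and
512x512 layout stated earlier, and leaves the top-left pixel verbatim)...

#include <stdio.h>
#include <stdlib.h>

#define HEADER 800L   /* header size stated above        */
#define W      512L   /* scanline length in bytes (lena) */
#define H      512L   /* number of scanlines             */

/* Filters in place, walking from the end of the buffer so every delta sees
   the original neighbour values; the inverse walks forward and adds instead
   of subtracting. */
static void delta2_filter(unsigned char *buf)
{
    long i, col, row;
    for (i = HEADER + W * H - 1; i >= HEADER; i--) {
        col = (i - HEADER) % W;
        row = (i - HEADER) / W;
        if (col > 0)                     /* delta against the left neighbour        */
            buf[i] = (unsigned char)(buf[i] - buf[i - 1]);
        else if (row > 0)                /* leftmost pixel: against the pixel above */
            buf[i] = (unsigned char)(buf[i] - buf[i - W]);
        /* the top-left pixel (row 0, col 0) is stored verbatim */
    }
    for (i = HEADER - 1; i >= 1; i--)    /* header bytes: simple deltas             */
        buf[i] = (unsigned char)(buf[i] - buf[i - 1]);
}

int main(int argc, char *argv[])
{
    long n = HEADER + W * H;
    unsigned char *buf = malloc((size_t)n);
    FILE *in, *out;

    if (argc != 3 || buf == NULL) return 1;
    if ((in = fopen(argv[1], "rb")) == NULL) return 1;
    if (fread(buf, 1, (size_t)n, in) != (size_t)n) return 1;
    fclose(in);
    delta2_filter(buf);
    if ((out = fopen(argv[2], "wb")) == NULL) return 1;
    fwrite(buf, 1, (size_t)n, out);
    fclose(out);
    free(buf);
    return 0;
}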

This deltas every pixel in the image except the top-leftmost. The
results were noticeably better than XOR'ing: gzip and zip were able to
compress the delta'd file about as well as bzip2 compressed the original
file. Unlike XOR'ing, delta'ing actually resulted in an improvement of a
few kbytes for both bzip2 and paq6...

-rw-r--r-- 1 user2 262944 May 1 21:59 lena
-rw-r--r-- 1 user2 184509 May 1 21:59 lena.bz2
-rw-r--r-- 1 user2 234543 May 1 21:59 lena.gz
-rw-r--r-- 1 user2 172388 May 5 19:24 lena.pq6
-rw-r--r-- 1 user2 234660 May 6 01:31 lena.zip

-rw-r--r-- 1 user2 262944 May 7 21:54 lena.delta_2
-rw-r--r-- 1 user2 181978 May 7 21:54 lena.delta_2.bz2
-rw-r--r-- 1 user2 184615 May 7 21:54 lena.delta_2.gz
-rw-r--r-- 1 user2 167758 May 7 22:00 lena.delta_2.pq6
-rw-r--r-- 1 user2 184740 May 7 21:58 lena.delta_2.zip

I thought I was on to something, but my hopes were dashed when I
switched from default (paq -3) to paq -4 ...

-rw-r--r-- 1 user2 166900 May 7 22:05 lena4.pq6
-rw-r--r-- 1 user2 166460 May 7 22:07 lena4.delta_2.pq6

So much for that idea. Of course, my 5-year-old Dell with 128 megs
of RAM isn't exactly a candidate for trying higher levels than paq -4.
But I assume that the higher compression levels will wipe out any
remaining advantage of delta'ing images. Back to the drawing board.

Matt Mahoney

May 8, 2004, 7:35:07 PM
"Walter Dnes (delete the 'z' to get my real address)"
<wzal...@waltdnes.org> wrote in message
news:2g4rd4F...@uni-berlin.de...

> I thought I was on to something, but my hopes were dashed when I
> switched from default (paq -3) to paq -4 ...
>
> -rw-r--r-- 1 user2 166900 May 7 22:05 lena4.pq6
> -rw-r--r-- 1 user2 166460 May 7 22:07 lena4.delta_2.pq6

The delta coding duplicates some modeling done by paq6 for 24-bit images
that is turned on when using the -4 option or higher. One model is a record
model that discovers the image width (sometimes) and models the current
pixel (byte) in a 2-byte context: one byte to the left (actually another color
component 2/3 of the time) and the pixel above. The other is part of the
analog model, which has the following two contexts:

xxxx.... ........ ........ xxxxxxxx ........ ........ ? (position mod 3)
xxxxxx.. xxxx.... xxxx.... ? (position mod 3)

where (?) is the byte being modeled, and x represents bits which form part
of the context. Only the high order bits of the previous pixels are used
with preference for the same color.

-- Matt Mahoney

