Encoding png with sub-byte pixel formats

Kevin Gillette

Oct 31, 2013, 8:11:14 PM
to golan...@googlegroups.com
Is there any existing plan to extend image/png with support for _encoding_ Gray and Paletted input images at 1, 2, and 4 bits per pixel? I do a lot of work with monochrome images, and could get decent additional space savings in either of two ways: if the encoder checked the length of input palettes and selected a sub-byte packing mode for small ones, or if something like a Mono type were added to the image/color package and recognized by the image/png encoder.
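
To make the palette-length idea concrete, here is a minimal sketch of how an encoder might pick a bit depth from the number of palette entries. The helper name bitDepthForColors is hypothetical and is not part of image/png; the depths 1, 2, 4 and 8 are the ones PNG permits for paletted images.

```go
package main

import (
	"fmt"
	"image/color"
)

// bitDepthForColors returns the smallest PNG palette bit depth
// (1, 2, 4 or 8) whose index range covers n palette entries.
// Hypothetical helper for illustration; not part of image/png.
func bitDepthForColors(n int) int {
	switch {
	case n <= 2:
		return 1
	case n <= 4:
		return 2
	case n <= 16:
		return 4
	default:
		return 8
	}
}

func main() {
	mono := color.Palette{color.Black, color.White}
	fmt.Println(bitDepthForColors(len(mono))) // a 2-color palette fits in 1 bit
}
```

A two-entry palette maps to 1 bit per pixel, so eight pixels share each byte of a scanline.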

Nigel Tao

Oct 31, 2013, 11:25:46 PM
to Kevin Gillette, golang-nuts
There are no plans.

How big are the savings, really? I'm just guessing, but given zlib
compression and filtering, I wouldn't expect e.g. a 4bpp PNG to be
half the size of the equivalent image encoded at 8bpp.

Kevin Gillette

Nov 1, 2013, 8:18:32 AM
to golan...@googlegroups.com, Kevin Gillette
It can be substantial, though of course it varies with compressibility (an image with large areas of a single solid color will compress well regardless of the bit depth).

As an example, test_bw_raw.pbm inside of https://github.com/harrydb/go/tree/master/img/pnm/testdata, essentially an image containing random noise, when decoded from pbm using that library and then encoded using image/png, with no intermediate processing, produces a 56k PNG image; that pnm library decodes 1-bit input bitmaps to 8-bit gray. I submitted a pull request which instead decodes to a 2-color palette, which image/png encodes as an 8-bit-per-pixel palette, producing a file of 36k when compressed. An optimal 1-bit gray or 1-bit palette PNG of the same input should be about 28k. Note that this is a fairly small sample image; larger inputs containing hard-to-compress data would get more benefit out of any potential improvements.
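
For a sense of scale, the following sketch computes how many bytes of scanline data go into zlib at each bit depth, before any compression. The 640x400 dimensions are the ones quoted elsewhere in this thread for the test image; rawSize is a hypothetical helper, and the numbers say nothing about the compressed size.

```go
package main

import "fmt"

// rawSize returns the number of bytes of scanline data in a
// non-interlaced PNG before zlib compression: every row carries one
// filter-type byte followed by its pixels packed at bitsPerPixel.
// Hypothetical helper for illustration.
func rawSize(width, height, bitsPerPixel int) int {
	rowBytes := (width*bitsPerPixel + 7) / 8 // round up to whole bytes
	return height * (1 + rowBytes)
}

func main() {
	for _, depth := range []int{8, 4, 2, 1} {
		fmt.Printf("%d bpp: %d bytes\n", depth, rawSize(640, 400, depth))
	}
}
```

At 1 bit per pixel the compressor sees 32,400 bytes instead of 256,400, which matters most when the data is close to incompressible.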

Nigel Tao

Nov 1, 2013, 7:50:09 PM
to Kevin Gillette, golang-nuts
On Fri, Nov 1, 2013 at 11:18 PM, Kevin Gillette
<extempor...@gmail.com> wrote:
> As an example, test_bw_raw.pbm inside of
> https://github.com/harrydb/go/tree/master/img/pnm/testdata, essentially an
> image containing random noise, when decoded from pbm using that library and
> then encoded using image/png, with no intermediate processing, produces a
> 56k PNG image; that pnm library decodes 1-bit input bitmaps to 8-bit gray. I
> submitted a pull request which instead decodes to a 2-color palette, which
> image/png encodes as an 8-bit-per-pixel palette, producing a file of 36k
> when compressed. An optimal 1-bit gray or 1-bit palette PNG of the same
> input should be about 28k. Note that this is a fairly small sample image;
> larger inputs containing hard-to-compress data would get more benefit out of
> any potential improvements.

For that particular 640x400 test image, it's hard to get excited about
an 8k saving. Larger images would obviously show bigger savings, but
what's the real world use case for generating many large, bi-color
images containing random noise? For non-random-noisy images, both ZLIB
and PNG filtering work on bytes, not bits, so stuffing multiple pixels
into a byte might actually be counter-productive (although I haven't
done the experiments to determine this).
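
To illustrate the point about bytes versus bits, here is a minimal sketch of the PNG Sub filter (filter type 1). The PNG specification defines the filter distance in whole bytes, rounded up to 1 for bit depths below 8, so on packed rows the filter compares groups of pixels rather than neighboring pixels. The function name subFilter is an assumption for this example.

```go
package main

import "fmt"

// subFilter applies PNG filter type 1 (Sub) to one scanline: each byte
// has the byte bpp positions to its left subtracted from it, modulo 256.
// bpp is the number of bytes per complete pixel, rounded up to 1, so at
// bit depths 1, 2 and 4 the filter works on whole packed bytes.
func subFilter(row []byte, bpp int) []byte {
	out := make([]byte, len(row))
	for i, v := range row {
		if i >= bpp {
			v -= row[i-bpp] // byte arithmetic wraps modulo 256
		}
		out[i] = v
	}
	return out
}

func main() {
	// 8 bits per pixel: a run of white pixels filters to zeros after
	// the first byte.
	fmt.Println(subFilter([]byte{255, 255, 255, 255}, 1))
	// 1 bit per pixel: each byte holds 8 pixels, and the filter only
	// sees the difference between whole bytes.
	fmt.Println(subFilter([]byte{0xF0, 0xF0, 0x0F}, 1))
}
```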

To be honest, the PNG spec certainly allows for writing such an
encoding, but I'm not convinced yet that it would be used frequently
enough and save enough bytes to be worth complicating the image/png
code for it. I'm already unhappy enough with both the reader and
writer code being a giant case bash for the various color models.

For your specific use case, it might be best to just have the 100-odd
lines of custom code to encode 1-bit PNGs directly. The PNG format is
reasonably straightforward.

--------
package main

import (
	"bufio"
	"bytes"
	"compress/zlib"
	"hash/crc32"
	"image/color"
	"image/jpeg"
	"log"
	"os"
)

// writeUint32 stores u in b[0:4] in big-endian order, as PNG requires.
func writeUint32(b []uint8, u uint32) {
	b[0] = uint8(u >> 24)
	b[1] = uint8(u >> 16)
	b[2] = uint8(u >> 8)
	b[3] = uint8(u >> 0)
}

// writeChunk writes one PNG chunk: a 4-byte length, the 4-byte chunk
// type, the data, and a CRC-32 computed over the type and the data.
func writeChunk(w *bufio.Writer, b []byte, name string) error {
	var (
		header [8]byte
		footer [4]byte
	)

	writeUint32(header[:4], uint32(len(b)))
	header[4] = name[0]
	header[5] = name[1]
	header[6] = name[2]
	header[7] = name[3]
	crc := crc32.NewIEEE()
	crc.Write(header[4:8])
	crc.Write(b)
	writeUint32(footer[:4], crc.Sum32())

	_, err := w.Write(header[:])
	if err != nil {
		return err
	}
	_, err = w.Write(b)
	if err != nil {
		return err
	}
	_, err = w.Write(footer[:])
	return err
}

func main() {
	// Go 1.4 and later keep this file under src/image/testdata instead.
	src, err := os.Open(os.Getenv("GOROOT") +
		"/src/pkg/image/testdata/video-001.jpeg")
	if err != nil {
		log.Fatal(err)
	}
	defer src.Close()

	dst, err := os.Create("out.png")
	if err != nil {
		log.Fatal(err)
	}
	defer dst.Close()

	m, err := jpeg.Decode(src)
	if err != nil {
		log.Fatal(err)
	}
	b := m.Bounds()

	w := bufio.NewWriter(dst)

	ihdr := [13]byte{}
	writeUint32(ihdr[0:4], uint32(b.Dx()))
	writeUint32(ihdr[4:8], uint32(b.Dy()))
	ihdr[8] = 1 // bits per pixel
	// ihdr[9:13] are zero to mean grayscale, default compression,
	// filter and interlacing.
	if _, err = w.WriteString("\x89PNG\r\n\x1a\n"); err != nil {
		log.Fatal(err)
	}
	if err = writeChunk(w, ihdr[:], "IHDR"); err != nil {
		log.Fatal(err)
	}
	idat := &bytes.Buffer{}
	zw := zlib.NewWriter(idat)

	// Each row is one filter-type byte (0 = None) followed by the
	// pixels packed eight to a byte, most significant bit first.
	buf := make([]byte, 1+((b.Dx()+7)/8))
	for y := b.Min.Y; y < b.Max.Y; y++ {
		for i := range buf {
			buf[i] = 0
		}
		index, shift := 1, uint(7)
		for x := b.Min.X; x < b.Max.X; x++ {
			c := color.GrayModel.Convert(m.At(x, y)).(color.Gray)
			if c.Y >= 128 {
				buf[index] |= 1 << shift
			}
			if shift != 0 {
				shift--
			} else {
				index, shift = index+1, 7
			}
		}
		if _, err = zw.Write(buf); err != nil {
			log.Fatal(err)
		}
	}

	if err = zw.Close(); err != nil {
		log.Fatal(err)
	}
	if err = writeChunk(w, idat.Bytes(), "IDAT"); err != nil {
		log.Fatal(err)
	}
	if err = writeChunk(w, nil, "IEND"); err != nil {
		log.Fatal(err)
	}
	if err = w.Flush(); err != nil {
		log.Fatal(err)
	}
}
--------
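
One way to sanity-check a hand-written encoder like the one above is to decode its output with image/png and compare pixels. The sketch below is a condensed in-memory variant, not the code from the post: the names chunk, encodeGray1, checker and roundTrip and the checkerboard test pattern are all assumptions made for this example.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"encoding/binary"
	"fmt"
	"hash/crc32"
	"image/png"
	"log"
)

// chunk appends one PNG chunk (length, type, data, CRC-32) to buf.
func chunk(buf *bytes.Buffer, name string, data []byte) {
	var n [4]byte
	binary.BigEndian.PutUint32(n[:], uint32(len(data)))
	buf.Write(n[:])
	buf.WriteString(name)
	buf.Write(data)
	sum := crc32.Update(crc32.ChecksumIEEE([]byte(name)), crc32.IEEETable, data)
	binary.BigEndian.PutUint32(n[:], sum)
	buf.Write(n[:])
}

// encodeGray1 builds a 1-bit grayscale PNG in memory; white reports
// whether the pixel at (x, y) is white.
func encodeGray1(w, h int, white func(x, y int) bool) ([]byte, error) {
	var idat bytes.Buffer
	zw := zlib.NewWriter(&idat)
	row := make([]byte, 1+(w+7)/8) // row[0] is the filter type (0 = None)
	for y := 0; y < h; y++ {
		for i := range row {
			row[i] = 0
		}
		for x := 0; x < w; x++ {
			if white(x, y) {
				row[1+x/8] |= byte(0x80) >> uint(x%8)
			}
		}
		if _, err := zw.Write(row); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}

	var ihdr [13]byte
	binary.BigEndian.PutUint32(ihdr[0:4], uint32(w))
	binary.BigEndian.PutUint32(ihdr[4:8], uint32(h))
	ihdr[8] = 1 // bit depth; ihdr[9:13] stay zero as in the post

	var out bytes.Buffer
	out.WriteString("\x89PNG\r\n\x1a\n")
	chunk(&out, "IHDR", ihdr[:])
	chunk(&out, "IDAT", idat.Bytes())
	chunk(&out, "IEND", nil)
	return out.Bytes(), nil
}

// checker is an arbitrary test pattern: a checkerboard.
func checker(x, y int) bool { return (x+y)%2 == 0 }

// roundTrip encodes a w x h checkerboard, decodes it with image/png
// and counts the pixels that differ from the pattern.
func roundTrip(w, h int) (int, error) {
	data, err := encodeGray1(w, h, checker)
	if err != nil {
		return 0, err
	}
	m, err := png.Decode(bytes.NewReader(data))
	if err != nil {
		return 0, err
	}
	mismatches := 0
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			r, _, _, _ := m.At(x, y).RGBA()
			if (r >= 0x8000) != checker(x, y) {
				mismatches++
			}
		}
	}
	return mismatches, nil
}

func main() {
	n, err := roundTrip(13, 5) // a width that is not a multiple of 8
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("mismatched pixels:", n)
}
```

The odd width exercises the padding bits at the end of each row, which is the part of sub-byte packing that is easiest to get wrong.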

Kevin Gillette

Nov 2, 2013, 8:14:54 PM
to golan...@googlegroups.com, Kevin Gillette
On Friday, November 1, 2013 5:50:09 PM UTC-6, Nigel Tao wrote:
> On Fri, Nov 1, 2013 at 11:18 PM, Kevin Gillette
> <extempor...@gmail.com> wrote:
> > As an example, test_bw_raw.pbm inside of
> > https://github.com/harrydb/go/tree/master/img/pnm/testdata, essentially an
> > image containing random noise, when decoded from pbm using that library and
> > then encoded using image/png, with no intermediate processing, produces a
> > 56k PNG image; that pnm library decodes 1-bit input bitmaps to 8-bit gray. I
> > submitted a pull request which instead decodes to a 2-color palette, which
> > image/png encodes as an 8-bit-per-pixel palette, producing a file of 36k
> > when compressed. An optimal 1-bit gray or 1-bit palette PNG of the same
> > input should be about 28k. Note that this is a fairly small sample image;
> > larger inputs containing hard-to-compress data would get more benefit out of
> > any potential improvements.
>
> For that particular 640x400 test image, it's hard to get excited about
> an 8k saving. Larger images would obviously show bigger savings, but
> what's the real world use case for generating many large, bi-color
> images containing random noise?

You see a bit of that in bilevel scans of paper pages that contain color or grayscale images. It may sound odd, but there's a niche where faithful reproduction of non-textual content is unimportant, and in some of these cases compatibility constraints force the use of certain formats, such as PNG. Such scans often have very large swaths of white, or large black blocks or bars (such as a horizontal rule in a letterhead), so hypothetically they should compress decently despite the patches of "noise".
 
> For non-random-noisy images, both ZLIB
> and PNG filtering work on bytes, not bits, so stuffing multiple pixels
> into a byte might actually be counter-productive (although I haven't
> done the experiments to determine this).

I need to do thorough experiments on this as well, and I agree that sub-byte packing of random pixels may be counter-productive: instead of bytes taking only the values 0 or 255 (just two "symbols" for the compressor), packed bytes could take on any of 256 values. One-byte-per-pixel encoding probably also has a much better chance of producing byte-aligned duplicate strings that reach the match threshold (3 bytes or longer) within the 32 KB window. Nonetheless, bit-packed bilevel PNG images, such as those produced by GIMP, are ending up smaller than the non-packed PNGs produced by image/png: about 26k, whether the GIMP PNG encoder's "compression level" is set to 6 or 9.
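
A rough starting point for such an experiment might look like the sketch below, which compresses the same random bilevel pixels in both layouts with compress/zlib and prints the sizes. It makes several simplifying assumptions: the pixels are pseudo-random rather than taken from a real scan, the width is a multiple of 8, and PNG filter bytes are left out, so the numbers will not match those of an actual PNG encoder. The names deflatedSize and noise are made up for this example.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"math/rand"
)

// deflatedSize returns the length of data after zlib compression at
// the default level.
func deflatedSize(data []byte) int {
	var buf bytes.Buffer
	zw := zlib.NewWriter(&buf)
	zw.Write(data) // errors ignored: the destination is an in-memory buffer
	zw.Close()
	return buf.Len()
}

// noise returns the same w*h pseudo-random bilevel pixels in two
// layouts: one byte per pixel (0 or 255) and eight pixels per byte,
// most significant bit first. It assumes w is a multiple of 8 and
// omits PNG filter bytes to keep the sketch short.
func noise(w, h int, seed int64) (unpacked, packed []byte) {
	rng := rand.New(rand.NewSource(seed))
	unpacked = make([]byte, w*h)
	packed = make([]byte, w*h/8)
	for i := range unpacked {
		if rng.Intn(2) == 1 {
			unpacked[i] = 255
			packed[i/8] |= byte(0x80) >> uint(i%8)
		}
	}
	return unpacked, packed
}

func main() {
	unpacked, packed := noise(640, 400, 1)
	fmt.Println("8 bpp:", len(unpacked), "bytes raw ->", deflatedSize(unpacked), "compressed")
	fmt.Println("1 bpp:", len(packed), "bytes raw ->", deflatedSize(packed), "compressed")
}
```

Swapping the noise generator for pixels read from a real bilevel scan would be the more interesting test, since that is where the byte-alignment argument could go either way.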
 
> For your specific use case, it might be best to just have the 100-odd
> lines of custom code to encode 1-bit PNGs directly. The PNG format is
> reasonably straightforward.

That's perfectly reasonable.

Rob Pike

Nov 2, 2013, 8:27:37 PM
to Kevin Gillette, golan...@googlegroups.com
I used to generate 2- and 4-bit GIFs from black and white drawings and
the resulting files would indeed be much smaller than the 8-bit
versions although the comparison is unfair since I was also throwing
away information.

-rob