zlib compressbound equivalent?

Dan Kortschak

Sep 12, 2012, 5:31:45 PM
to golan...@googlegroups.com
Is there a compressbound equivalent in the gzip package? I have not been
able to find one.

thanks
Dan

agl

Sep 13, 2012, 10:58:19 AM
to golan...@googlegroups.com
On Wednesday, September 12, 2012 5:31:56 PM UTC-4, kortschak wrote:
> Is there a compressbound equivalent in the gzip package? I have not been
> able to find one.

There is not one in the package. However, since it implements the deflate algorithm, we can assume that the bounds are the same as for zlib. From the zlib sources:

complen = sourceLen + ((sourceLen + 7) >> 3) + ((sourceLen + 63) >> 6) + 5

If the stream is wrapped in gzip framing, the bound needs an additional 18 bytes on top of that (10 for the minimal header and 8 for the trailer).
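For anyone who wants this in Go, here is a minimal sketch of the expression above. The function and constant names are made up for illustration and are not part of compress/flate or compress/gzip. The 18 bytes assume a minimal gzip header with no optional fields such as a file name or comment.

```go
package main

import "fmt"

// gzipOverhead assumes the minimal gzip wrapper: a 10-byte header plus an
// 8-byte trailer (CRC-32 and ISIZE). Optional header fields would add more.
const gzipOverhead = 18

// conservativeDeflateBound transcribes the "conservative upper bound"
// expression quoted from the zlib sources. The name is hypothetical.
func conservativeDeflateBound(sourceLen uint64) uint64 {
	return sourceLen + ((sourceLen + 7) >> 3) + ((sourceLen + 63) >> 6) + 5
}

// conservativeGzipBound adds the minimal gzip framing on top of that.
func conservativeGzipBound(sourceLen uint64) uint64 {
	return conservativeDeflateBound(sourceLen) + gzipOverhead
}

func main() {
	for _, n := range []uint64{0, 1, 1 << 10, 1 << 16} {
		fmt.Printf("n=%d deflate<=%d gzip<=%d\n",
			n, conservativeDeflateBound(n), conservativeGzipBound(n))
	}
}
```

Whether Go's compressor always stays under this figure is an assumption carried over from zlib, not something the standard library documents.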


Cheers

AGL

Dan Kortschak

Sep 13, 2012, 8:08:46 PM
to agl, golan...@googlegroups.com
That looks high. Which zlib version did you get that from?

1.2.7 has this:
sourceLen + (sourceLen >> 12) + (sourceLen >> 14) + (sourceLen >> 25) + 13
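To put numbers on "looks high", here is a small Go sketch that evaluates both expressions from this thread side by side. The function names are hypothetical; only the arithmetic comes from the quoted zlib sources.

```go
package main

import "fmt"

// tightBound transcribes the zlib 1.2.7 expression quoted just above.
func tightBound(sourceLen uint64) uint64 {
	return sourceLen + (sourceLen >> 12) + (sourceLen >> 14) + (sourceLen >> 25) + 13
}

// conservativeBound transcribes the expression from the earlier message.
func conservativeBound(sourceLen uint64) uint64 {
	return sourceLen + ((sourceLen + 7) >> 3) + ((sourceLen + 63) >> 6) + 5
}

func main() {
	for _, n := range []uint64{0, 4096, 1 << 20} {
		t, c := tightBound(n), conservativeBound(n)
		// Print the overhead each formula allows beyond the input size.
		fmt.Printf("n=%d tight=+%d conservative=+%d\n", n, t-n, c-n)
	}
}
```

The tight form allows roughly 5 bytes per 16 KiB of input plus a constant, while the conservative form allows a little over 14% on top of the input.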

thanks
Dan

Adam Langley

Sep 14, 2012, 11:13:24 AM
to Dan Kortschak, golan...@googlegroups.com
On Thu, Sep 13, 2012 at 8:08 PM, Dan Kortschak
<dan.ko...@adelaide.edu.au> wrote:
> That looks high. Which zlib version did you get that from?

It appears to be 1.2.5, but I also see the bound that you quoted at
the bottom of the function under this comment:

/* default settings: return tight bound for that case */

I found the bound that I quoted at the top of the function with this comment:

/* conservative upper bound for compressed data */

and it seemed safer to use that one.


Cheers

AGL

Dan Kortschak

Sep 14, 2012, 4:59:22 PM
to Adam Langley, golan...@googlegroups.com
Thanks.

Dan Kortschak

Sep 14, 2012, 10:32:23 PM
to Adam Langley, golan...@googlegroups.com
OK, compressBound is in compress.c and differs from deflateBound, which
is in deflate.c: compressBound gives only the less conservative
estimate, while deflateBound uses the z_stream parameters to get a
better idea of what the constraint is going to be.

The code I'm porting from uses compressBound, which allows some liberal
assumptions to be made about the input buffer size that the conservative
constraint of deflateBound would not (the input buffer would need to be
smaller).

I'm not sure that this would necessarily cause a problem, but I'd rather
avoid it.

I don't know enough about how the Go standard library's flate
compression differs from the zlib compress algorithm (does zlib use
flate in compress.c?) to know which is safe.

Perhaps an approach is just to fill buffers and try compression with
various settings to see if I can get a failure (the package does check
after compression to catch these failures, but it would be nice to know
up front).
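
A rough sketch of that kind of experiment, assuming random bytes are a fair stand-in for worst-case input. The helper names are hypothetical. Note that this measures raw DEFLATE output from compress/flate, whereas zlib's compressBound covers zlib-wrapped output, so the comparison is only indicative. A clean run does not prove the bound holds for every input.

```go
package main

import (
	"bytes"
	"compress/flate"
	"fmt"
	"math/rand"
)

// compressBound transcribes the zlib 1.2.7 expression quoted earlier.
func compressBound(n int) int {
	return n + (n >> 12) + (n >> 14) + (n >> 25) + 13
}

// compressedLen returns how many raw DEFLATE bytes compress/flate emits
// for data at the given level.
func compressedLen(data []byte, level int) (int, error) {
	var buf bytes.Buffer
	w, err := flate.NewWriter(&buf, level)
	if err != nil {
		return 0, err
	}
	if _, err := w.Write(data); err != nil {
		return 0, err
	}
	if err := w.Close(); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

func main() {
	rng := rand.New(rand.NewSource(1)) // fixed seed so runs are repeatable
	for _, size := range []int{0, 1, 4095, 16384, 1 << 20} {
		data := make([]byte, size)
		rng.Read(data) // random bytes approximate incompressible input
		for level := flate.NoCompression; level <= flate.BestCompression; level++ {
			n, err := compressedLen(data, level)
			if err != nil {
				fmt.Println("error:", err)
				return
			}
			status := "ok"
			if n > compressBound(size) {
				status = "EXCEEDS BOUND"
			}
			fmt.Printf("size=%d level=%d compressed=%d bound=%d %s\n",
				size, level, n, compressBound(size), status)
		}
	}
}
```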

Any ideas?

thanks
Dan

Dan Kortschak

Sep 15, 2012, 7:36:09 PM
to Dan Kortschak, Adam Langley, golan...@googlegroups.com
I've had a look at the zlib technical details page[1] and it gives a pretty good explanation of the relevant details. Can anyone confirm that flate.logWindowSize corresponds to zlib's windowBits? I can't find a constant in flate that corresponds to the second relevant parameter, memLevel. It looks to me like this is rolled silently into flate.hashBits, implying a memLevel of 10 (hashBits is 17 and zlib defines the equivalent as memLevel + 7). Does this sound reasonable?

If these are correct, then it looks like flate compresses slightly harder than zlib using its default values, since from the technical details, I understand that increasing either of hashBits or windowBits should improve compression at the cost of runtime memory use.

Am I on the right track?

thanks
Dan

[1] http://www.zlib.net/zlib_tech.html