Re: Is there any faster way to do zlib compress?


bryanturley

unread,
Feb 4, 2013, 12:47:48 PM2/4/13
to golan...@googlegroups.com
Try the benchmark compressing 20+ MB; the overhead of compressing that handful of bytes might be killing your test.

steve wang

unread,
Feb 4, 2013, 1:32:12 PM2/4/13
to golan...@googlegroups.com
io.Copy only uses a fixed 32 KB buffer, which may slow your program down if the data to be compressed is bulky. You could make use of bufio to improve your program's performance in this case.
I can't tell more without seeing your real code and its Python counterpart.
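
Roughly what I mean, as a sketch; decompressBuffered and the 256 KB buffer size are just examples, not anything from your code:

package zlibutil

import (
	"bufio"
	"compress/zlib"
	"io"
)

// decompressBuffered wraps the destination in a large bufio.Writer so
// that io.Copy's fixed 32 KB chunks get batched into fewer, larger writes.
func decompressBuffered(compressed io.Reader, out io.Writer) error {
	r, err := zlib.NewReader(compressed)
	if err != nil {
		return err
	}
	defer r.Close()
	bw := bufio.NewWriterSize(out, 256*1024) // arbitrary example size
	if _, err := io.Copy(bw, r); err != nil {
		return err
	}
	return bw.Flush() // push any buffered bytes through to out
}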

On Monday, February 4, 2013 5:37:51 PM UTC+8, davy zhang wrote:
I wrote some simple zlib code using compress/zlib, but it's way slower than the Python version.

I know Python uses a C extension to do that.

But the encoding/json package has the same efficiency as Python's json module (though it's slower than Python's UltraJSON extension).

So, is there any way to make zlib faster? My project heavily depends on this functionality.

here is the test code:

package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
	"time"
)

func main() {
	times := 30000
	var in, out bytes.Buffer
	b := []byte(`{"Name":"Wednesday","Age":6,"Parents":["Gomez","Morticia"],"test":{"prop1":1,"prop2":[1,2,3]}}`)
	t1 := time.Now()
	for i := 0; i < times; i++ {
		w := zlib.NewWriter(&in)
		w.Write(b)
		w.Close() // Close rather than just Flush, so the reader sees a properly terminated stream
		r, _ := zlib.NewReader(&in)
		io.Copy(&out, r)
		in.Reset()
		out.Reset()
	}
	fmt.Println(time.Since(t1))
}

Bryan Turley

unread,
Feb 4, 2013, 1:38:32 PM2/4/13
to davy zhang, golan...@googlegroups.com
This went off-list; let me put it back on for context.

On Mon, Feb 4, 2013 at 11:58 AM, davy zhang <davy...@gmail.com> wrote:
Thanks for the advice.
I don't know whether sizes adding up to 20+ MB would produce a better result, but this test is based on my project's situation: I use zlib to compress network packets, not files, and packets are seldom larger than 1 MB. C zlib with Python does a good job on this point.


 
Actually, most real network packets are smaller than 1500 bytes. What you should do is run zlib (or another compressor) over a stream of data, not over individual packets.

20+ MB is not a magic number; it's just that compressing only 20-40 bytes is not going to get you as much as a longer stream.
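
A minimal sketch of what I mean; compressStream and the packets channel are made up for illustration:

package streamdemo

import (
	"compress/zlib"
	"io"
)

// compressStream keeps one zlib stream alive for the whole connection,
// so the history built from earlier packets helps compress later ones.
func compressStream(conn io.Writer, packets <-chan []byte) error {
	w := zlib.NewWriter(conn)
	defer w.Close()
	for pkt := range packets {
		if _, err := w.Write(pkt); err != nil {
			return err
		}
		// Flush emits a sync marker so the receiver can decode
		// everything sent so far without waiting for Close.
		if err := w.Flush(); err != nil {
			return err
		}
	}
	return nil
}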


On Tuesday, February 5, 2013 at 1:47:48 AM UTC+8, bryanturley wrote:

Dave Cheney

unread,
Feb 4, 2013, 3:17:07 PM2/4/13
to steve wang, golan...@googlegroups.com
Please use the standard benchmark idiom when benchmarking. You can find examples in the compress/* packages themselves.

Please provide the usual details about your hardware platform, OS and Go version.

Please provide the Python version for comparison so that others can reproduce your benchmark.
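
For example, something along these lines (a sketch; the package and benchmark names are made up, and it goes in a _test.go file):

package zlibbench

import (
	"bytes"
	"compress/zlib"
	"io"
	"io/ioutil"
	"testing"
)

var sample = []byte(`{"Name":"Wednesday","Age":6,"Parents":["Gomez","Morticia"],"test":{"prop1":1,"prop2":[1,2,3]}}`)

// Run with: go test -bench=.
func BenchmarkZlibRoundTrip(b *testing.B) {
	for i := 0; i < b.N; i++ { // the framework picks b.N for stable timings
		var buf bytes.Buffer
		w := zlib.NewWriter(&buf)
		w.Write(sample)
		w.Close()
		r, _ := zlib.NewReader(&buf)
		io.Copy(ioutil.Discard, r)
		r.Close()
	}
}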

Cheers

Dave

davy zhang

unread,
Feb 5, 2013, 11:59:01 AM2/5/13
to golan...@googlegroups.com, steve wang
I did this test on my MacBook Pro (dual-core i7, OS X 10.8).
The comparable Python version is brain-dead simple:

import time
import zlib

times = 30000  # same iteration count as the Go test
s = '{"Name":"Wednesday","Age":6,"Parents":["Gomez","Morticia"],"test":{"prop1":1,"prop2":[1,2,3]}}'

st = time.time()
for i in xrange(times):
    zlib.decompress(zlib.compress(s))
et = time.time()

print "zlib:", et - st

Python 2.7 and Go 1.0.3.

Thanks for any advice.

steve wang

unread,
Feb 5, 2013, 1:34:05 PM2/5/13
to golan...@googlegroups.com, steve wang
My profiling suggests that zlib's performance needs improvement in memory management.
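
A minimal harness for this kind of profiling looks something like the following (a sketch, not my exact code; zlib.prof is an arbitrary file name). Feed the output to go tool pprof:

package main

import (
	"bytes"
	"compress/zlib"
	"io"
	"io/ioutil"
	"os"
	"runtime/pprof"
)

func main() {
	f, err := os.Create("zlib.prof")
	if err != nil {
		panic(err)
	}
	pprof.StartCPUProfile(f)
	defer pprof.StopCPUProfile()

	// Exercise the same round trip as the benchmark in this thread.
	b := []byte(`{"Name":"Wednesday","Age":6,"Parents":["Gomez","Morticia"],"test":{"prop1":1,"prop2":[1,2,3]}}`)
	var in bytes.Buffer
	for i := 0; i < 30000; i++ {
		w := zlib.NewWriter(&in)
		w.Write(b)
		w.Close()
		r, _ := zlib.NewReader(&in)
		io.Copy(ioutil.Discard, r)
		in.Reset()
	}
}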

davy zhang

unread,
Feb 6, 2013, 12:12:19 AM2/6/13
to golan...@googlegroups.com, steve wang
Yep. After I posted this thread, I tried wrapping zlib directly using cgo, and I found that binary write/read is "very" expensive in Go.

Code like this:

buf := new(bytes.Buffer)
// buf := bytes.NewBuffer(rawBytes) // this improves things a little
binary.Write(buf, binary.BigEndian, uint32(dstLen))
binary.Write(buf, binary.BigEndian, rawBytes)

will slow things down by about 30% in my test.
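
If the cost is the reflection inside binary.Write, one workaround might be to encode the length header by hand; a sketch, where frame is a made-up helper:

package framing

import "encoding/binary"

// frame prepends a 4-byte big-endian length header without going
// through binary.Write, which reflects on its argument.
func frame(payload []byte) []byte {
	out := make([]byte, 4+len(payload))
	binary.BigEndian.PutUint32(out[:4], uint32(len(payload)))
	copy(out[4:], payload)
	return out
}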

If I use a pure cgo call to zlib compress/uncompress with a thin wrapper, the performance is almost the same as Python's zlib.

But when I try to store the original length of rawBytes (needed to size the uncompress allocation) using the binary package, the code is significantly slower than the previous version.

The pure cgo version takes 438.452 ms, the Python version 431 ms, and the binary header read/write version 4694.922 ms.

So I guess memory management or GC performs badly under pressure.

Here is the sample code; note the commented-out lines, which I disabled to improve performance:

func maxZipLen(nLenSrc int) int {
	n16kBlocks := (nLenSrc + 16383) / 16384 // round up any fraction of a block
	return nLenSrc + 6 + (n16kBlocks * 5)
}

func Zip(src *[]byte) []byte {
	srcLen := len(*src)
	raw := unsafe.Pointer(&((*src)[0])) // change []byte to Pointer

	memLen := C.size_t(maxZipLen(srcLen))
	// fmt.Println("mem length is ", memLen)
	dst := C.calloc(memLen, 1)
	defer C.free(dst)
	dstLen := C.ulong(memLen)
	C.zcompress(dst, &dstLen, raw, C.ulong(srcLen))

	// write the compressed length
	rawBytes := C.GoBytes(dst, C.int(dstLen))
	// buf := new(bytes.Buffer)
	// buf := bytes.NewBuffer(rawBytes)
	// binary.Write(buf, binary.BigEndian, uint32(dstLen))
	// binary.Write(buf, binary.BigEndian, rawBytes)
	// fmt.Printf("%02x\n", buf.Bytes())
	// return buf.Bytes()
	return rawBytes
}


func UnZip(src *[]byte, oriLen uint32) []byte {
	srcLen := len(*src)

	buf := new(bytes.Buffer)
	buf.Write(*src)
	// binary.Read(buf, binary.BigEndian, &oriLen)
	// fmt.Println("original size found ", oriLen)
	// rawBytes := make([]byte, oriLen)
	// binary.Read(buf, binary.BigEndian, &rawBytes)
	// ioutil.WriteFile("/tmp/go_compressed_inter", rawBytes, 0644)
	// raw := unsafe.Pointer(&((rawBytes)[0])) // change []byte to Pointer
	raw := unsafe.Pointer(&((*src)[0])) // change []byte to Pointer

	// fmt.Println("mem length is ", oriLen)
	dst := C.calloc(C.size_t(oriLen), 1)
	defer C.free(dst)
	dstLen := C.ulong(oriLen)
	C.zuncompress(dst, &dstLen, raw, C.ulong(srcLen))
	// fmt.Println("origLen after uncompressed", dstLen)

	// fmt.Printf("%02x\n", buf.Bytes())
	return C.GoBytes(dst, C.int(dstLen))
}
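
For reference, a round trip through these wrappers might look like this (my example, not part of the code above); note that the caller has to carry the original length out of band, which is exactly why I wanted the length header:

func roundTrip() {
	data := []byte(`{"Name":"Wednesday","Age":6}`)
	packed := Zip(&data)
	unpacked := UnZip(&packed, uint32(len(data)))
	fmt.Println(bytes.Equal(data, unpacked)) // expect true
}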

Dave Cheney

unread,
Feb 6, 2013, 12:14:44 AM2/6/13
to davy zhang, golan...@googlegroups.com, steve wang
> yep, after I posted this thread, I tried to wrap the zlib directly using
> CGO. I found the binary write/read is "very" expensive in golang.

cgo transitions are expensive.

> if I use the pure cgo call to zlib compress/uncompress with some wrap, the
> performance is almost the same as python zlib

That would make sense; both are having to translate from the Go/Python
environment to the C environment.

> So I guess the memory management or gc is bad performance under pressure

Don't guess, measure. GOGCTRACE=1 may be useful here.

Dave

Sugu Sougoumarane

unread,
Feb 6, 2013, 1:08:27 PM2/6/13
to golan...@googlegroups.com, davy zhang, steve wang
We ran into the same issues with Vitess. We have a cgo wrapper that's within 2% of the C library's performance in our tests. YMMV:

Nigel Tao

unread,
Feb 6, 2013, 7:38:57 PM2/6/13
to davy zhang, golang-nuts
On Mon, Feb 4, 2013 at 8:37 PM, davy zhang <davy...@gmail.com> wrote:
> I wrote some simple zlib code using compress/zlib, but it's way slower than
> the Python version

If you're on the stable release (Go 1.0.3), the upcoming Go 1.1 should have a faster zlib.
https://codereview.appspot.com/6872063/ suggests a 1.5x improvement in decompression on a MacBook Pro for decent-sized workloads.

Hamish Ogilvy

unread,
Feb 8, 2013, 7:17:38 PM2/8/13
to golan...@googlegroups.com, davy zhang, steve wang
Interesting. What about memory usage; have you done any measurements? I currently suspect zlib leaves a huge memory footprint behind when lots of smaller files are compressed quickly, though I'm still analysing before posting any data.

I'll have a look at cgzip though. Thanks for the tip.

Regards,
Hamish


On Thursday, 7 February 2013 04:26:24 UTC+11, Mike Solomon wrote:
cgo transitions are not expensive enough to be an issue here.

We have the exact same performance issues in Vitess, so we have the cgzip module, which works just as you describe. Its performance is within ~2% of the C version, IIRC. I've dropped a link to the module if you want to just use it.

There are several issues contributing to the inefficiency of pure-Go zlib. They are all fixable, but if linking via cgo is an option, I would take that road for now.

Hamish Ogilvy

unread,
Feb 11, 2013, 10:07:03 PM2/11/13
to golan...@googlegroups.com, davy zhang, steve wang
Just a quick note to say I was wrong; the memory footprint for zlib is fine. Don't listen to me...