Reviewers: rsc,
Message:
Hello r...@golang.org (cc: da...@cheney.net, golan...@googlegroups.com,
kra...@golang.org),
I'd like you to review this change to
https://go.googlecode.com/hg/
Description:
compress/flate: optimize history-copy decoding.
The naiveCopy function could be rewritten in assembly, and the copyHuff
method could probably be rolled into huffmanBlock and copyHist, but
I'm leaving those changes for future CLs.
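For readers unfamiliar with why a history copy needs care: DEFLATE allows a
back-reference whose length exceeds its distance, in which case the bytes
being written become part of the source and the pattern repeats. The sketch
below illustrates that behaviour. It is not the code in this CL; the name
overlapCopy and its signature are hypothetical, and it is only an assumption
that naiveCopy does something along these lines.

```go
package main

import "fmt"

// overlapCopy is a hypothetical helper, not the CL's naiveCopy. It copies n
// bytes within buf from offset src to offset dst, one byte at a time in
// increasing index order, so a byte written early in the loop can be read
// again later in the same loop when the two ranges overlap.
func overlapCopy(buf []byte, dst, src, n int) {
	for i := 0; i < n; i++ {
		buf[dst+i] = buf[src+i]
	}
}

func main() {
	// "ab" followed by room for a match of length 6 at distance 2.
	buf := []byte("ab______")
	overlapCopy(buf, 2, 0, 6)
	fmt.Println(string(buf)) // abababab

	// The built-in copy handles overlap like memmove: it behaves as if the
	// source were read in full before any write, so nothing is replicated.
	buf2 := []byte("ab______")
	copy(buf2[2:8], buf2[0:6])
	fmt.Println(string(buf2)) // abab____
}
```

When the source and destination ranges do not overlap, the two approaches give
the same result, so the built-in copy can be used for those cases.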
compress/flate benchmarks:
benchmark                               old ns/op  new ns/op    delta
BenchmarkDecoderBestSpeed1K                385327     431482  +11.98%
BenchmarkDecoderBestSpeed10K              1245190    1062026  -14.71%
BenchmarkDecoderBestSpeed100K             8512365    5838158  -31.42%
BenchmarkDecoderDefaultCompression1K       382225     422697  +10.59%
BenchmarkDecoderDefaultCompression10K      867950     615917  -29.04%
BenchmarkDecoderDefaultCompression100K    5658240    2477376  -56.22%
BenchmarkDecoderBestCompression1K          383760     422538  +10.10%
BenchmarkDecoderBestCompression10K         867743     615757  -29.04%
BenchmarkDecoderBestCompression100K       5660160    2478030  -56.22%
image/png benchmarks:
benchmark                     old ns/op  new ns/op    delta
BenchmarkDecodeGray             2540834    2462029   -3.10%
BenchmarkDecodeNRGBAGradient   10052700    9781590   -2.70%
BenchmarkDecodeNRGBAOpaque      8704710    8371630   -3.83%
BenchmarkDecodePaletted         1458779    1396961   -4.24%
BenchmarkDecodeRGB              7183606    6951584   -3.23%
Wall time for Denis Cheremisov's PNG-decoding program given in
https://groups.google.com/group/golang-nuts/browse_thread/thread/22aa8a05040fdd49
Before: 3.07s
After: 2.55s
Delta: -17%
Before profile:
Total: 304 samples
     159  52.3%  52.3%      251  82.6% compress/flate.(*decompressor).huffmanBlock
      58  19.1%  71.4%       76  25.0% compress/flate.(*decompressor).huffSym
      32  10.5%  81.9%       32  10.5% hash/adler32.update
      16   5.3%  87.2%       22   7.2% bufio.(*Reader).ReadByte
      16   5.3%  92.4%       37  12.2% compress/flate.(*decompressor).moreBits
       7   2.3%  94.7%        7   2.3% hash/crc32.update
       7   2.3%  97.0%        7   2.3% runtime.memmove
       5   1.6%  98.7%        5   1.6% scanblock
       2   0.7%  99.3%        9   3.0% runtime.copy
       1   0.3%  99.7%        1   0.3% compress/flate.(*huffmanDecoder).init
After profile:
Total: 253 samples
      75  29.6%  29.6%       88  34.8% compress/flate.(*decompressor).huffSym
      55  21.7%  51.4%       55  21.7% compress/flate.naiveCopy
      30  11.9%  63.2%       30  11.9% hash/adler32.update
      24   9.5%  72.7%      194  76.7% compress/flate.(*decompressor).huffmanBlock
      17   6.7%  79.4%       33  13.0% compress/flate.(*decompressor).moreBits
      11   4.3%  83.8%       17   6.7% bufio.(*Reader).ReadByte
      10   4.0%  87.7%       10   4.0% runtime.memmove
       9   3.6%  91.3%       67  26.5% compress/flate.(*decompressor).copyHist
       7   2.8%  94.1%        7   2.8% hash/crc32.update
       7   2.8%  96.8%        7   2.8% scanblock
Please review this at
http://codereview.appspot.com/6127064/
Affected files:
A src/pkg/compress/flate/copy.go
A src/pkg/compress/flate/copy_test.go
M src/pkg/compress/flate/inflate.go