Zlib is a compression library used by the Chromium code base and its dependencies (skia, libpng, pdfium, freetype, etc.) for quite a few tasks, ranging from image handling and loading extensions to accessing compressed content (e.g. Content-Encoding: gzip).
It is an impressive feat of engineering, considering that it is 22 years old and is used all over the place (e.g. the Linux kernel). Due to its history and the need to support long-gone compilers and operating systems, its main focus has been portability rather than performance.
Since Chromium developers and users care about performance, it makes sense to have optimizations in the zlib used by Chromium. Since 2014, Chromium's zlib has featured Intel-specific optimizations (e.g. optimized CRC, optimized hash function and fill_window), but to this day it still has no optimizations targeting the ARM processors used in mobile devices.
In January this year I noticed this issue and started working to address it, with an initial focus on optimizing PNG image decoding.
Since maintaining a forked zlib isn't ideal, I did some research on zlib alternatives (https://goo.gl/ZUoy96) and tried to upstream the ARM-specific optimizations, with some degree of success (zlib-ng accepted the patches; canonical zlib still hasn't reviewed them after 4 months). The ideal goal initially was a scenario where Chromium wouldn't need to keep a forked zlib.
A first obstacle that had to be solved was the presence of multiple copies of zlib in the Chromium code base (e.g. PDFium had its own zlib with patches applied on top of it). Fortunately, PDFium has migrated to Chromium's zlib (i.e. third_party/zlib), so this is no longer an issue.
That being said, given that zlib-ng hasn't yet made an official release and its security status is unknown (e.g. new bugs?), it seems a bit too risky to migrate Chromium to it. On the other hand, canonical zlib doesn't seem interested in either performance or security patches.
Until these external factors change (so we can revisit the issue), my intent is to land the ARM-specific optimizations in Chromium's zlib.
The patches are:
a) NEON implementation of Adler32 checksum:
https://chromium-review.googlesource.com/c/611492
It should be about 2x to 3x faster than the C implementation featured in zlib today. This should help with PNG image decoding.
b) Using the ARMv8 CRC32 instruction:
https://chromium-review.googlesource.com/c/612629
It should be between 6x and 10x faster. This one should help both with image decoding and with other areas (e.g. gzipped content).
Since not all ARMv8 SoCs feature this instruction, and the option has to be enabled at build time, I would love to hear from people familiar with how the Android APK is generated and distributed about how we could enable the feature. Devices like the Nexus 5X and Google Pixel will benefit from this change, as they have the CRC32 instruction.
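To make the build-time dependency concrete, here is a minimal sketch (not the code in the review linked above) of a byte-at-a-time CRC-32 that uses the ARMv8 instruction only when the compiler was told it is available. The function name crc32_sketch is hypothetical. __ARM_FEATURE_CRC32 and __crc32b come from the Arm C Language Extensions and are typically enabled with a flag such as -march=armv8-a+crc; without that flag the portable bitwise loop is compiled instead.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>   /* ACLE intrinsics, including __crc32b */
#endif

/* Hypothetical helper: same calling convention as zlib's crc32()
 * (start with crc = 0, feed the previous result back in to continue). */
uint32_t crc32_sketch(uint32_t crc, const unsigned char *buf, size_t len)
{
    uint32_t c = ~crc;                      /* pre-inversion */
    for (size_t i = 0; i < len; i++) {
#if defined(__ARM_FEATURE_CRC32)
        /* One hardware instruction per input byte. */
        c = __crc32b(c, buf[i]);
#else
        /* Portable fallback: reflected polynomial 0xEDB88320, bit by bit. */
        c ^= buf[i];
        for (int k = 0; k < 8; k++)
            c = (c >> 1) ^ (0xEDB88320u & (0u - (c & 1u)));
#endif
    }
    return ~c;                              /* post-inversion */
}
```

Because the choice is made by the preprocessor, a binary built with the flag will fault on an SoC that lacks the instruction, which is why the packaging question matters. A production version would presumably also consume wider chunks per instruction (e.g. 8 bytes at a time) rather than single bytes.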
There are still other areas in zlib that we can optimize for ARM with good potential performance gains (e.g. fill_window).
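For reference on patch (a), the scalar Adler-32 computation that a NEON version would replace can be sketched as below. This is a simplified illustration rather than zlib's actual adler32() (which defers the modulo for speed), and the name adler32_scalar is hypothetical. The two running sums are what a vectorized version would accumulate across several bytes at once.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD 65521u   /* largest prime below 2^16 */

/* Hypothetical helper: start with adler = 1, feed the previous
 * result back in to continue over another buffer. */
uint32_t adler32_scalar(uint32_t adler, const unsigned char *buf, size_t len)
{
    uint32_t a = adler & 0xffffu;          /* sum of bytes (plus 1) */
    uint32_t b = (adler >> 16) & 0xffffu;  /* sum of the running 'a' values */
    for (size_t i = 0; i < len; i++) {
        a = (a + buf[i]) % ADLER_MOD;
        b = (b + a) % ADLER_MOD;
    }
    return (b << 16) | a;
}
```

Calling adler32_scalar(1, (const unsigned char *)"Wikipedia", 9) yields 0x11E60398.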
Best regards
Adenilson Cavalcanti