TL;DR: What's the strategy behind using ARM64's CRC32C instructions in Chromium for Android? (Skia appears to be using them on Android without checking for their support, so we seem to assume that all ARM64 chips support them.)
The relevant part about the patch is that it gates the use of CRC32C instructions by a getauxval(AT_HWCAP) runtime check, which requires the <sys/auxv.h> header. I was able to build and run this code on Android in a standalone repository, using the latest Android NDK (r15c). However, I was not able to build the code in Chromium, as the <sys/auxv.h> header appears.
While looking for CRC32C consumers in Chrome, I found out that Skia uses ARM64's CRC32C accelerated instructions for its own hashing function. Assuming I understand the code correctly, Skia uses the instructions without any runtime check.
Does this mean I should follow suit and unconditionally use CRC32C instructions on ARM64 builds?
--
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev
---
You received this message because you are subscribed to the Google Groups "Chromium-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chromium-dev...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/chromium-dev/eba5997b-2cc6-4572-833f-6596ab3b4b25%40chromium.org.
Hey Victor,

It is nice to see more people looking for optimizations on ARM. :-) Since it is the dominant embedded/mobile platform, it makes sense to improve Chromium's performance on it. I will comment inline.

> TL;DR: What's the strategy behind using ARM64's CRC32C instructions in Chromium for Android? (Skia appears to be using them on Android without checking for their support, so we seem to assume that all ARM64 chips support them.)

This is something I would like to know too.
From my post on blink-dev (https://goo.gl/pDGXHL), my understanding is that the Chrome apk distributed through the Google Play Store is an armv7 build (and chrome://version shows "Official 32-bit build"). Could anyone confirm this?
Most of the flagship devices today have an ARMv8 SoC (Google Pixel, Galaxy S8, LG G6, etc). Even an old Nexus 5X has one, and newer, cheaper devices will have it too (devices with an ARM Cortex-A53, e.g. Nokia 6 and 3).
Which raises the question: has anyone ever considered distributing an optimized build for those devices (e.g. -march=armv8-a)? I don't have numbers, but it is not hard to see some possible performance benefits.
> The relevant part about the patch is that it gates the use of CRC32C instructions by a getauxval(AT_HWCAP) runtime check, which requires the <sys/auxv.h> header. I was able to build and run this code on Android in a standalone repository, using the latest Android NDK (r15c). However, I was not able to build the code in Chromium, as the <sys/auxv.h> header appears.

This seems to point to proper support for the syscall in the latest NDK for Android. What I'm unsure about is whether the toolchain used by Chromium for Android is the latest one.

> While looking for CRC32C consumers in Chrome, I found out that Skia uses ARM64's CRC32C accelerated instructions for its own hashing function. Assuming I understand the code correctly, Skia uses the instructions without any runtime check.

That is interesting. I did some investigation into hash functions (https://bugs.chromium.org/p/chromium/issues/detail?id=735674#c8), and using the crc32 instruction was indeed faster on ARM, even though it could produce a few more collisions than other hashes (e.g. cityhash, highway hash, etc) for the specific test case I studied (i.e. ShapeCache & HashMap).

A runtime check should be performed, as the instruction is optional on ARMv8-A and mandatory on ARMv8.1 (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0801g/awi1476352818103.html). As an example, IIRC the first iPhone featuring an ARMv8 SoC didn't have the crc32 instruction (unsure if they now have it).

> Does this mean I should follow suit and unconditionally use CRC32C instructions on ARM64 builds?

Nope, as explained above.
Another important detail is that you can use the instruction even in 32-bit mode (as long as the SoC supports it). As an example, this was fixed in Skia by a teammate at ARM (https://skia-review.googlesource.com/c/skia/+/15480).
Concerning runtime detection: assuming that it is possible to do the syscall, the same approach could be used as in the aforementioned LevelDB patch. On the other hand, what happens if the detection has to be done at a less privileged level (e.g. inside the renderer process) for image decoding done by a dependency (i.e. libpng uses zlib for decompressing IDAT segments)? That, by the way, is a case I'm interested in: https://chromium-review.googlesource.com/c/chromium/src/+/612629
One alternative to the syscall (if that is indeed a limitation for the Chromium case) would be to just check /proc/cpuinfo, as the information should be there. For the device I'm using it returns:
Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt lpae evtstrm aes pmull sha1 sha2 crc32
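A minimal sketch of what the /proc/cpuinfo check could look like once a "Features" line has been read, assuming the line has the same shape as the output above. FeaturesLineHasFlag is a hypothetical helper name for illustration, not an existing Chromium or base/cpu.cc function, and the sketch deliberately leaves out opening and reading the file.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not a Chromium API): returns true if `flag` appears as
// a whole whitespace-separated token in a /proc/cpuinfo "Features" line.
// Whole-token matching avoids false positives from substrings, e.g. looking
// for "vfp" must not match "vfpv3".
bool FeaturesLineHasFlag(const std::string& line, const std::string& flag) {
  // Only accept lines that start with the "Features" key.
  if (line.compare(0, 8, "Features") != 0) return false;

  const std::string::size_type colon = line.find(':');
  if (colon == std::string::npos) return false;

  // Split everything after the colon on whitespace and compare tokens.
  std::istringstream tokens(line.substr(colon + 1));
  std::string token;
  while (tokens >> token) {
    if (token == flag) return true;
  }
  return false;
}
```

With the line shown above, FeaturesLineHasFlag(line, "crc32") would report that the instruction is available. This only helps in a process that is actually allowed to read /proc/cpuinfo.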
I can imagine that maybe we could hook into base/cpu.cc and worst case have an IPC call from the Renderer to the Browser process. Any thoughts?
Cheers Adenilson
getauxval() doesn't use a syscall; it reads information provided by the kernel to the C library at process startup, so it can be used in a renderer process. It is always available on Android/arm64, but that is not the case on Android/arm32 (only available since Android M, IIRC). Reading /proc/ will not work in renderer processes on certain devices, due to different kernel + SELinux configurations.
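A minimal sketch of the getauxval() approach, assuming a Linux or Android target. CpuHasCrc32 and the SKETCH_ macro are hypothetical names. The arm64 bit (bit 7 of AT_HWCAP) is the same constant used in the Skia snippet quoted elsewhere in this thread; the 32-bit case (bit 4 of AT_HWCAP2) is taken from the Linux kernel's ARM hwcap definitions. The sketch does not handle 32-bit Android releases where getauxval() is missing from the C library; that case would presumably need a dynamic lookup or another fallback.

```cpp
#include <cstdint>

// Only pull in <sys/auxv.h> on Linux/Android ARM targets that ship it.
#if defined(__linux__) && (defined(__aarch64__) || defined(__arm__)) && \
    defined(__has_include)
#if __has_include(<sys/auxv.h>)
#include <sys/auxv.h>
#define SKETCH_HAVE_AUXV 1
#endif
#endif

// Some older headers do not define AT_HWCAP2; 26 is its value on Linux.
#if defined(SKETCH_HAVE_AUXV) && !defined(AT_HWCAP2)
#define AT_HWCAP2 26
#endif

// Hypothetical helper: reports whether the running CPU advertises the CRC32
// extension. Returns false on any platform this sketch cannot query.
bool CpuHasCrc32() {
#if defined(SKETCH_HAVE_AUXV) && defined(__aarch64__)
  // AArch64: CRC32 is bit 7 of AT_HWCAP.
  const unsigned long kHwcapCrc32 = 1UL << 7;
  return (getauxval(AT_HWCAP) & kHwcapCrc32) != 0;
#elif defined(SKETCH_HAVE_AUXV) && defined(__arm__)
  // 32-bit ARM: CRC32 is bit 4 of AT_HWCAP2.
  const unsigned long kHwcap2Crc32 = 1UL << 4;
  return (getauxval(AT_HWCAP2) & kHwcap2Crc32) != 0;
#else
  // Not an ARM Linux target, or no <sys/auxv.h>: assume no support.
  return false;
#endif
}
```

Since the auxiliary vector is filled in at process startup, the result cannot change during the lifetime of the process, so a caller could compute it once and cache it.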
Skia does use CRC32 instructions, but only after checking for runtime support. Here's how we do it (from SkCpu.cpp):

    #elif defined(SK_CPU_ARM64) && __has_include(<sys/auxv.h>)
        #include <sys/auxv.h>

        static uint32_t read_cpu_features() {
            const uint32_t kHWCAP_CRC32 = (1<<7);

            uint32_t features = 0;
            uint32_t hwcaps = getauxval(AT_HWCAP);
            if (hwcaps & kHWCAP_CRC32) { features |= SkCpu::CRC32; }
            return features;
        }

It's a pretty bad idea to use them without checking. Notably, no iDevices have CRC32 as far as I know. Frustratingly, Apple's Clang #defines the guard that indicates they do by default! We're forced to ignore them:

    // Really this __APPLE__ check shouldn't be necessary, but it seems that Apple's Clang defines
    // __ARM_FEATURE_CRC32 for -arch arm64, even though their chips don't support those instructions!
    #if defined(__ARM_FEATURE_CRC32) && !defined(__APPLE__)
        #define SK_ARM_HAS_CRC32
    #endif
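To illustrate how a compile-time guard and a runtime check can fit together, here is a rough sketch of a dispatcher that picks between a portable CRC32C and an intrinsic-based one. It is not Skia's or LevelDB's code: Crc32cPortable, Crc32cHardware, SelectCrc32c and the SKETCH_ macro are hypothetical names, and the caller is assumed to pass in the result of a runtime check such as the getauxval() one above. The hardware path uses the ACLE __crc32cb intrinsic from <arm_acle.h> and is only compiled when the compiler advertises __ARM_FEATURE_CRC32 on a non-Apple target.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__ARM_FEATURE_CRC32) && !defined(__APPLE__)
#include <arm_acle.h>
#define SKETCH_HAS_CRC32_INTRINSICS 1
#endif

using Crc32cFn = uint32_t (*)(const uint8_t* data, size_t len);

// Portable bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
// Slow, but works on every CPU.
uint32_t Crc32cPortable(const uint8_t* data, size_t len) {
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; ++i) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; ++bit) {
      // If the low bit is set, shift and xor in the polynomial.
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
  }
  return ~crc;
}

#if defined(SKETCH_HAS_CRC32_INTRINSICS)
// Byte-at-a-time version using the ARMv8 CRC32C instruction.
uint32_t Crc32cHardware(const uint8_t* data, size_t len) {
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; ++i) {
    crc = __crc32cb(crc, data[i]);
  }
  return ~crc;
}
#endif

// Returns the hardware version only when it was compiled in AND the caller's
// runtime check says the CPU supports it; otherwise the portable fallback.
Crc32cFn SelectCrc32c(bool cpu_has_crc32) {
#if defined(SKETCH_HAS_CRC32_INTRINSICS)
  if (cpu_has_crc32) return &Crc32cHardware;
#else
  (void)cpu_has_crc32;  // No hardware path in this build.
#endif
  return &Crc32cPortable;
}
```

In a real build the hardware function would probably live in its own source file compiled with the CRC flag, so that the rest of the binary is not built under the assumption that the extension exists.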
On Saturday, August 26, 2017 at 6:48:19 PM UTC-4, Victor Costan wrote:
> While looking for CRC32C consumers in Chrome, I found out that Skia uses ARM64's CRC32C accelerated instructions for its own hashing function. Assuming I understand the code correctly, Skia uses the instructions without any runtime check.
@David: thanks a lot for the explanation; the example provided is quite helpful.
It is quite interesting to know what the primary concerns for the Chrome apk are (e.g. binary size).
Assuming that we were able to keep the same apk/binary size (with target_os = "android", target_cpu = "arm", arm_version = 8, arm_arch = "armv8-a+crc"), what would be the threshold (% speedup) to justify the effort of providing a specialized build for new mobile devices in the *future*?
Using the CRC32 specific instruction on ARMv8 yielded a boost of about 7% in the time for decoding PNGs. I haven't measured it yet, but it should also help with loading gzipped webpages (e.g. google, gmail, gcalendar, engadget, etc).
@Victor: Thanks for the kind words, I feel everyone wants a better and faster Chrome on their mobile devices.
Please see the comments inline:
> Do you happen to know if the LevelDB patch would work on a 32-bit ARM?

I'm unfamiliar with the details of that specific patch (besides the fact that it is a hybrid approach using NEON instructions for loading data and the scalar crc32 instruction). I would have to look at it and understand whether anything there relies on AArch64 behavior (e.g. support for unaligned memory access). At a first quick glance it looks fine (I can also ask my colleague about it).
That being said, I personally tested the crc32 instruction in Chromium in both 32-bit and 64-bit modes, running on a Google Pixel (Qualcomm Snapdragon 820), and it works fine (as in https://chromium-review.googlesource.com/c/chromium/src/+/612629).
So I don't see that as a major problem. I can give the LevelDB patch a try and report back to you.
> If so, would you happen to know what --march= flag should be used for the intrinsics used by the patch (crc32c{b,h,w,d} and vmull_p64)?

Let's break down the instructions: the first is a scalar instruction (optional in ARMv8-A, mandatory from ARMv8.1), while the second is a SIMD (NEON) instruction.
A bit of history: back when ARMv7 was released, support for NEON was optional (and IIRC there was a Tegra SoC that didn't have it). Later on, pretty much all SoCs started to have NEON support, but it is still not mandatory.
Therefore, you need to pass a flag (e.g. -mfpu=neon) to tell the compiler that your target has support for it.
In ARMv8 things are different: NEON support is mandatory. As a result, you don't need to pass any flag to the compiler to activate NEON support if your target is armv8 (e.g. -march=armv8-a). For Chromium, I think NEON support is enabled by default in an arm build (e.g. target_cpu = "arm").
The issue is that if you want to use the crc32 instruction, you have to tell the compiler about it (e.g. -march=armv8-a+crc). Depending on the compiler (gcc, clang) and its version, the flags can vary.
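One way to see whether a given set of compiler flags actually enabled the extension is to test the ACLE feature macro __ARM_FEATURE_CRC32, which GCC and Clang define when the target flags include CRC (e.g. -march=armv8-a+crc). The snippet below is only a small probe under that assumption; kCompiledWithCrc32 and Crc32BuildMode are hypothetical names. Note that this reports what the compiler was told, not what the CPU running the binary supports.

```cpp
#include <cstring>

// True when the compiler was invoked with flags that enable the CRC32
// extension, in which case the <arm_acle.h> CRC intrinsics are usable.
#if defined(__ARM_FEATURE_CRC32)
constexpr bool kCompiledWithCrc32 = true;
#else
constexpr bool kCompiledWithCrc32 = false;
#endif

// Hypothetical helper: human-readable description for a build log or test.
const char* Crc32BuildMode() {
  return kCompiledWithCrc32 ? "crc32 intrinsics enabled at compile time"
                            : "crc32 intrinsics not enabled at compile time";
}
```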
Maybe an example can help: see the zlib upstream pull request (https://github.com/madler/zlib/pull/251/files#diff-af3b638bc2a3e6c650974192a53c7291R156). That CMake file detects the compiler version and then passes the proper flag (it was tested with gcc 5.4 and 6.3, but not clang).
Just keep in mind that the official Chrome apk is not an ARMv8-specific target. At least for third_party/zlib, I didn't have to supply any specific flag to use NEON instructions (https://chromium-review.googlesource.com/c/chromium/src/+/611492/5/third_party/zlib/BUILD.gn), but I had to identify whether the architecture was armv8 with support for crc32 (https://chromium-review.googlesource.com/c/chromium/src/+/612629/3/third_party/zlib/BUILD.gn#69).
So, back to the Chromium case, you can enable it by passing arm_arch = "armv8-a+crc" to 'gn args' (https://gist.github.com/Adenilson/29974397cea0ff159eb89f8fe2d1ddca).
Not sure if this is the 'recommended' way, though.
The only options are armeabi-v7a (which is what our current 32-bit APK already targets) and arm64-v8a (which our 64-bit APKs target, but we don't release those to stable as discussed).