We’ve been working on optimizing RISC-V for big data tools like Spark and Flink and have noticed some bottlenecks (e.g., DWIO in Spark). On x86, these code paths are optimized with BMI2’s PDEP & PEXT instructions. The ARM code paths don’t use the comparable SVE2 instructions (BDEP & BEXT), so ARM performance isn’t great. For RISC-V, we saw that the draft Zbe extension once included BDEP & BEXT (or bdecompress & bcompress) instructions, but they were dropped from the final B instruction set.
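
For anyone not familiar with these instructions, here is a minimal portable sketch of what PEXT and PDEP compute on 64-bit values. The function names `pext64_ref` and `pdep64_ref` are made up for illustration and are not taken from Spark, Flink, or any library. This is roughly the kind of bit-by-bit loop a target without hardware support would need, which is why the missing instructions hurt.

```c
#include <stdint.h>

/* Reference PEXT: gather the bits of src selected by mask and pack them
 * contiguously into the low bits of the result. */
static uint64_t pext64_ref(uint64_t src, uint64_t mask) {
    uint64_t result = 0;
    uint64_t out_bit = 1;                      /* next packed output position */
    while (mask != 0) {
        uint64_t lowest = mask & (~mask + 1);  /* isolate lowest set mask bit */
        if (src & lowest)
            result |= out_bit;
        out_bit <<= 1;
        mask &= mask - 1;                      /* clear lowest set mask bit */
    }
    return result;
}

/* Reference PDEP: scatter the low bits of src to the positions where
 * mask has a 1; all other result bits are 0. */
static uint64_t pdep64_ref(uint64_t src, uint64_t mask) {
    uint64_t result = 0;
    uint64_t in_bit = 1;                       /* next source bit to consume */
    while (mask != 0) {
        uint64_t lowest = mask & (~mask + 1);  /* isolate lowest set mask bit */
        if (src & in_bit)
            result |= lowest;
        in_bit <<= 1;
        mask &= mask - 1;                      /* clear lowest set mask bit */
    }
    return result;
}
```

Each loop runs once per set bit in the mask, so a dense 64-bit mask costs up to 64 iterations where the hardware instruction is a single operation. The two functions are inverses on the masked bits: `pdep64_ref(pext64_ref(x, m), m)` equals `x & m`.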