Hi everyone,
I had another question while thinking about sparse matrix formats for RVV-targeted SpMV implementations.
Is CSR still generally considered the preferred storage format for SpMV on RISC-V Vector (RVV), or do formats such as ELLPACK, COO, or SELL-C-sigma tend to map better to RVV’s vector-length-agnostic execution model?
From what I understand, CSR is widely used because of its memory efficiency and generality, but I’m curious whether the irregular row lengths and indirect accesses become a bigger bottleneck on vector hardware specifically. On the other hand, formats like ELLPACK seem more vector-friendly due to their regular structure, even if they introduce padding overhead.
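To make the trade-off I'm asking about concrete, here is a minimal pure-Python sketch (not RVV code). It stores the same small matrix in CSR and in ELLPACK and runs SpMV on both. All function names are my own and purely illustrative, and the ELLPACK arrays are kept as simple per-row lists rather than the column-major layout a real vector kernel would probably use.

```python
def dense_to_csr(dense):
    """Build CSR arrays (values, col_idx, row_ptr) from a dense list of lists."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_to_ell(values, col_idx, row_ptr):
    """Pad every row to the longest row length (ELLPACK-style storage)."""
    n_rows = len(row_ptr) - 1
    width = max((row_ptr[i + 1] - row_ptr[i] for i in range(n_rows)), default=0)
    ell_vals = [[0.0] * width for _ in range(n_rows)]
    ell_cols = [[0] * width for _ in range(n_rows)]  # padded slots point at column 0
    for i in range(n_rows):
        for k, p in enumerate(range(row_ptr[i], row_ptr[i + 1])):
            ell_vals[i][k] = values[p]
            ell_cols[i][k] = col_idx[p]
    return ell_vals, ell_cols, width


def spmv_csr(values, col_idx, row_ptr, x):
    """CSR SpMV: the inner loop trip count changes from row to row."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for p in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[p] * x[col_idx[p]]  # indirect access into x
        y.append(acc)
    return y


def spmv_ell(ell_vals, ell_cols, width, x):
    """ELLPACK SpMV: fixed trip count; the loop over rows is the vectorizable one."""
    y = [0.0] * len(ell_vals)
    for k in range(width):
        for i in range(len(ell_vals)):
            y[i] += ell_vals[i][k] * x[ell_cols[i][k]]  # still a gather on x
    return y


# One long row among short ones: the case where padding hurts ELLPACK.
A = [
    [1, 0, 0, 0],
    [2, 3, 4, 5],
    [0, 6, 0, 0],
    [0, 0, 7, 0],
]
x = [1.0, 1.0, 1.0, 1.0]

values, col_idx, row_ptr = dense_to_csr(A)
ell_vals, ell_cols, width = csr_to_ell(values, col_idx, row_ptr)

nnz = len(values)
padded = len(A) * width - nnz  # wasted slots in the ELLPACK layout
print("nnz:", nnz, "ELL width:", width, "padding slots:", padded)
print("CSR result:", spmv_csr(values, col_idx, row_ptr, x))
print("ELL result:", spmv_ell(ell_vals, ell_cols, width, x))
```

In this toy case one long row forces every row to the same width, so more than half of the ELLPACK slots are padding, while the CSR loop has a different trip count per row. Note that both versions still access `x` indirectly; the regular structure only removes the row-length irregularity.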
I would really like to understand how sparse format choices change when targeting RVV/vector architectures in practice, and what formats are typically preferred in production HPC or research implementations.
Thanks!
Thank you for such a detailed response, and congrats on the GSoC project!
The CSR → SELL-C-sigma progression makes a lot of sense now. I hadn't considered how irregular row lengths hurt RVV utilization despite its variable-length design. The point about using CSR as the interchange format and converting only when repeated SpMV justifies the conversion cost is a really practical insight.
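To check my understanding, here is a minimal Python sketch of how I read the SELL-C-sigma layout: rows are sorted by descending length inside windows of sigma rows, grouped into chunks of C rows, and each chunk is padded only to its own longest row. The function names are hypothetical and not taken from any library, and the sketch only computes the row permutation and padded storage size, not the actual value arrays.

```python
def sell_c_sigma_layout(row_lengths, C, sigma):
    """Return (perm, chunk_widths) for a SELL-C-sigma style layout.

    perm         : row order after sorting within each window of sigma rows
    chunk_widths : padded width of each chunk of C consecutive rows in perm
    """
    n = len(row_lengths)
    perm = []
    for start in range(0, n, sigma):
        window = list(range(start, min(start + sigma, n)))
        # Sort only inside the window, longest rows first (stable sort).
        window.sort(key=lambda i: row_lengths[i], reverse=True)
        perm.extend(window)

    chunk_widths = []
    for start in range(0, n, C):
        chunk = perm[start:start + C]
        chunk_widths.append(max(row_lengths[i] for i in chunk))
    return perm, chunk_widths


def padded_slots(chunk_widths, C):
    """Total stored entries, counting padding (each chunk holds C rows)."""
    return sum(w * C for w in chunk_widths)


# Row lengths (nonzeros per row) of a hypothetical 8-row matrix.
row_lengths = [1, 4, 1, 4, 1, 1, 1, 1]
nnz = sum(row_lengths)
C = 2

# ELLPACK pads every row to the global maximum.
ell_slots = len(row_lengths) * max(row_lengths)

# sigma = 1 means no sorting (plain sliced ELLPACK with chunk height C).
_, widths_unsorted = sell_c_sigma_layout(row_lengths, C, sigma=1)
perm, widths_sorted = sell_c_sigma_layout(row_lengths, C, sigma=4)

print("nnz:", nnz)
print("ELLPACK slots:", ell_slots)
print("SELL-2-1 slots:", padded_slots(widths_unsorted, C))
print("SELL-2-4 slots:", padded_slots(widths_sorted, C))
print("row permutation:", perm)
```

For these row lengths, sorting within windows of four rows pairs the two long rows into one chunk, so the padding disappears, whereas the unsorted chunks and plain ELLPACK both store extra zeros. The permutation also has to be applied to the output vector (or undone afterwards), which is part of the conversion cost.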
Are there any open implementations or papers on SELL-C-sigma you'd recommend as a starting point?
Got it! The Kreutzer et al. paper is the foundational reference for SELL-C-σ, and I didn't know PETSc exposed a MATSELL type; that's a great concrete codebase to study alongside the theory. I'll also check out the SELL-P work by Anzt et al. for the GPU-side perspective.
Really appreciate you taking the time!