Scalar Replacement of Aggregates (SROA)


Pepper Lebeck-Jobe

Nov 30, 2024, 10:37:52 PM
to golan...@googlegroups.com
Dear golang-nuts,

Summary: Would we be open to adding SROA as a compiler optimization to the Go compiler?

I recently discovered via this rathole that SROA is something that Clang does for the programming languages which compile with it. The implementation is here.

I also think that the lack of this compiler optimization is why there is a notable performance difference in the two benchmarks mentioned in https://github.com/golang/go/issues/49785

On the one hand, I have heard that the Go team generally favors fast compilation over added compiler complexity and slower builds. On the other hand, I suspect that this optimization could really speed up a lot of Go projects that exist in the wild.

Before being surprised by the differences between the C and Go performance in the bddicken/languages repository, I wouldn't have thought twice about writing a loop that aggregates values into an element of a slice or an array in Go. Now that I've seen the performance difference, I'd much rather have the compiler optimize this for me than go combing through my projects and manually introducing a local aggregation variable in the hope of getting to use a register.
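
For readers following along, here is a minimal sketch of the two loop shapes being contrasted. The function names and the choice to accumulate into element 0 are illustrative assumptions, not code taken from the linked benchmark or issue.

```go
package main

import "fmt"

// sumIntoSlot (hypothetical name) accumulates directly into a slice
// element, so each iteration may have to load and store through memory.
func sumIntoSlot(acc []int64, values []int64) {
	for _, v := range values {
		acc[0] += v
	}
}

// sumViaLocal (hypothetical name) accumulates into a local variable,
// which the compiler is free to keep in a register, and writes the
// result back to the slice once at the end.
func sumViaLocal(acc []int64, values []int64) {
	total := acc[0]
	for _, v := range values {
		total += v
	}
	acc[0] = total
}

func main() {
	values := []int64{1, 2, 3, 4}
	a := make([]int64, 1)
	b := make([]int64, 1)
	sumIntoSlot(a, values)
	sumViaLocal(b, values)
	fmt.Println(a[0], b[0]) // both functions compute the same sum here
}
```

Note that the two functions are only equivalent when acc and values do not share backing memory; if they alias, the in-place version sees its own partial sums, which is one reason a compiler cannot always make this rewrite automatically.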

At this point, I'm just taking a "temperature" check to see if folks would entertain the idea. I haven't really studied compilers, and would probably need some guidance to successfully contribute it to gc. But I don't want to go through the learning and heavy lifting if there's a major chance that the PR review would end up sounding like, "While it clearly produces more efficient binaries, we don't think it's worth the additional compilation time."

Let me know if I should "dive in" or just remember to be very careful when aggregating operations in loops to a location in memory.

Thanks,
Pepper 

Jason E. Aten

Dec 2, 2024, 1:24:19 AM
to golang-nuts
Hi Pepper, 

Since your question is about the opinion of the developers who work on the Go compiler, it might get more attention over at the sibling group https://groups.google.com/g/golang-dev

I would add, if you can use TinyGo, then you can use Clang's optimizations today.
https://tinygo.org/ has a lot of limitations, but it is worth checking out.

Jason

Ian Lance Taylor

Dec 2, 2024, 2:39:20 PM
to Pepper Lebeck-Jobe, golan...@googlegroups.com
As I understand it, the Go compiler already does scalar replacement of
aggregates for small structs and single-element arrays. See
decomposeStructPhi and decomposeArrayPhi, in
https://go.googlesource.com/go/+/refs/heads/master/src/cmd/compile/internal/ssa/decompose.go.
I don't think anybody would object to extending the code if there were
minimal effect on compile time.
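
As a rough illustration of the kinds of values described above, here is a sketch with a small struct accumulator and a single-element array accumulator. The type and function names are hypothetical, and whether a particular compiler version actually decomposes these into scalars is an assumption to verify, for example by inspecting the assembly printed by `go build -gcflags=-S`.

```go
package main

import "fmt"

// point is a hypothetical small struct with two scalar fields.
type point struct {
	x, y int
}

// sumPoints accumulates into a local small struct. If the struct is
// decomposed, acc.x and acc.y can be treated as two independent scalars.
func sumPoints(ps []point) point {
	var acc point
	for _, p := range ps {
		acc.x += p.x
		acc.y += p.y
	}
	return acc
}

// sumSingle accumulates into a local single-element array, the other
// shape mentioned above.
func sumSingle(vs []int) int {
	var acc [1]int
	for _, v := range vs {
		acc[0] += v
	}
	return acc[0]
}

func main() {
	fmt.Println(sumPoints([]point{{1, 2}, {3, 4}}))
	fmt.Println(sumSingle([]int{1, 2, 3}))
}
```

Larger aggregates, such as multi-element arrays, would fall outside the cases listed above, which is where an extension of the existing code would come in.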

Ian