Why is there no auto-vectorization optimization in the Go compiler?

Wei....@arm.com

Jul 31, 2017, 6:30:04 AM

to golang-dev

Please correct me if I'm wrong.

I can't find an auto-vectorization optimization in the Go compiler, although both GCC and LLVM implement one.

Is each port responsible for implementing auto-vectorization?

Or is Go not designed for parallel-operation-intensive applications that can be sped up by vector instructions?

Or are there other concerns?

Thanks

Wei Xiao

David Chase

Jul 31, 2017, 10:23:00 AM

to Wei....@arm.com, golang-dev

Haven't gotten around to it yet, and it's not high on our list of user complaints. It would be fun, but other stuff is more pressing.

Some of the obstacles (besides the time to implement and debug it) include a need to vet the aliasing information (Go's not Fortran, and Go lacks a "restrict" keyword) and some uncertainty about how often it would actually be applicable.
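
To illustrate the aliasing concern, here is a minimal sketch (not compiler code; the function names addOne, addOneBlocked, and overlapping are made up for this example). It assumes a 4-wide "vector" step modeled as four loads followed by four stores. When dst and src are overlapping slices of the same backing array, the blocked loop no longer matches the element-by-element loop, so a vectorizer would have to prove the slices do not overlap or guard the fast path with a runtime check.

```go
package main

import "fmt"

// addOne is the ordinary loop: dst[i] = src[i] + 1, strictly in order.
func addOne(dst, src []int) {
	for i := range src {
		dst[i] = src[i] + 1
	}
}

// addOneBlocked mimics a 4-wide vector loop (an assumption for illustration):
// it loads four elements first and only then stores four results.
func addOneBlocked(dst, src []int) {
	i := 0
	for ; i+4 <= len(src); i += 4 {
		a, b, c, d := src[i], src[i+1], src[i+2], src[i+3]
		dst[i], dst[i+1], dst[i+2], dst[i+3] = a+1, b+1, c+1, d+1
	}
	// Scalar remainder loop for lengths that are not a multiple of four.
	for ; i < len(src); i++ {
		dst[i] = src[i] + 1
	}
}

// overlapping runs f with dst aliasing src, shifted by one element.
func overlapping(f func(dst, src []int)) []int {
	buf := make([]int, 9)
	f(buf[1:], buf[:8])
	return buf
}

func main() {
	fmt.Println(overlapping(addOne))        // [0 1 2 3 4 5 6 7 8]
	fmt.Println(overlapping(addOneBlocked)) // [0 1 1 1 1 2 1 1 1]
}
```

With non-overlapping slices the two functions agree; only the aliased call exposes the difference.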

I imagine the basic auto-vectorization optimization would be generic across architectures and follow the usual recipe of identifying induction variables and slice addressing using those induction variables. The machine-dependent parts are whether there are vector ops to care about (though software pipelining is also an option), whether their loads and stores carry extra alignment restrictions, and the vector sizes.
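
As a rough picture of that recipe, the sketch below rewrites a loop by hand the way such a pass might: i is the induction variable, a[i], b[i], and dst[i] are the slice accesses indexed by it, and the loop is strip-mined into fixed-size chunks plus a scalar remainder. Everything here is an assumption for illustration: the names addScalar and addStripMined, the width of 4, and the use of a [width]float64 array as a stand-in for a vector register. It is not how the Go compiler does or would do it.

```go
package main

import "fmt"

// width is a hypothetical vector width in elements; real hardware varies.
const width = 4

// addScalar is the ordinary loop a vectorizer would look for:
// one induction variable and slice accesses indexed directly by it.
func addScalar(dst, a, b []float64) {
	for i := range dst {
		dst[i] = a[i] + b[i]
	}
}

// addStripMined is the same loop split into width-sized chunks.
// It assumes dst does not overlap a or b.
func addStripMined(dst, a, b []float64) {
	n := len(dst)
	a, b = a[:n], b[:n] // panics early if a or b is shorter than dst
	i := 0
	for ; i+width <= n; i += width {
		var va, vb [width]float64 // stand-ins for vector registers
		copy(va[:], a[i:i+width]) // "vector load"
		copy(vb[:], b[i:i+width]) // "vector load"
		for l := range va {       // "vector add", one lane at a time
			va[l] += vb[l]
		}
		copy(dst[i:i+width], va[:]) // "vector store"
	}
	// Scalar remainder for the last n % width elements.
	for ; i < n; i++ {
		dst[i] = a[i] + b[i]
	}
}

func main() {
	a := make([]float64, 10)
	b := make([]float64, 10)
	for i := range a {
		a[i] = float64(i)
		b[i] = float64(10 * i)
	}
	want := make([]float64, len(a))
	got := make([]float64, len(a))
	addScalar(want, a, b)
	addStripMined(got, a, b)

	same := true
	for i := range want {
		if want[i] != got[i] {
			same = false
		}
	}
	fmt.Println("results match:", same)
}
```

Element-wise addition keeps each lane independent, so the chunked version produces the same values as the scalar one. A reduction such as a floating-point sum would be harder, since regrouping the additions can change the rounding.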


Chad Kunde

Aug 30, 2017, 4:24:59 AM

to golang-dev

If you've got a commonly vectorized loop, there's a set of hand-optimized loops in gonum/v1/gonum/floats that covers many common float64 vectorized loops. If not, what do you need?

Wei....@arm.com

Sep 1, 2017, 4:09:02 AM

to golang-dev

What I mean is a pass that transforms an ordinary loop into a commonly vectorized loop and then either emits vector instructions or calls functions like the ones you mentioned.
