Haven't gotten around to it yet, and it's not high on our list of user complaints. It would be fun, but other stuff is more pressing.
Some of the obstacles (besides time to implement and debug it) include a need to vet the aliasing information
(Go's not Fortran, and it has no "restrict" keyword) and some uncertainty about how often it would actually be applicable.
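To make the aliasing worry concrete, here is a small sketch. The names scalarAdd and blockedAdd are made up for illustration, and blockedAdd only imitates a 4-wide vector loop in plain Go by loading a whole block before storing anything. It is not anything the compiler emits. When the two slices overlap at an offset, the two versions disagree:

```go
package main

import "fmt"

// scalarAdd is the loop as a programmer would write it:
// one element at a time, each store visible to the next load.
func scalarAdd(dst, src []int) {
	for i := range dst {
		dst[i] += src[i]
	}
}

// blockedAdd imitates a 4-wide vectorized loop (hypothetical width):
// it loads four elements of src and dst before storing any result.
func blockedAdd(dst, src []int) {
	const width = 4
	i := 0
	for ; i+width <= len(dst); i += width {
		var s, d [width]int
		copy(s[:], src[i:i+width]) // "vector load" of src
		copy(d[:], dst[i:i+width]) // "vector load" of dst
		for j := 0; j < width; j++ {
			dst[i+j] = d[j] + s[j] // "vector store"
		}
	}
	// Scalar tail for the leftover elements.
	for ; i < len(dst); i++ {
		dst[i] += src[i]
	}
}

func main() {
	a := []int{1, 1, 1, 1, 1, 1, 1, 1, 1}
	scalarAdd(a[1:], a[:8]) // dst and src overlap, shifted by one
	fmt.Println(a)          // [1 2 3 4 5 6 7 8 9]

	b := []int{1, 1, 1, 1, 1, 1, 1, 1, 1}
	blockedAdd(b[1:], b[:8])
	fmt.Println(b) // [1 2 2 2 2 3 2 2 2]
}
```

Nothing in the signature func(dst, src []int) rules out that call, so a vectorizer would have to either prove the slices don't overlap or guard the vector path with a runtime overlap check.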
I imagine the basic auto-vectorization optimization would be generic across architectures and follow the usual recipe of identifying induction variables and the slice addressing that uses them. The machine-dependent parts are whether there are vector ops to care about (though software pipelining is also an option), whether their loads and stores carry extra alignment restrictions, and the vector sizes.
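Roughly, the recipe amounts to the hand-written rewrite below. This is a sketch in plain Go, not compiler output: lanes = 4 is an assumed vector width, the function names are invented for the example, and it assumes the slices don't overlap at an offset. The induction variable i steps by the vector width, every slice index is i plus a constant, and a scalar loop handles the remainder:

```go
package main

import "fmt"

// lanes is an assumed vector width; the real value is machine-dependent.
const lanes = 4

// addScalar is the source loop: i is the induction variable and
// every access is a slice indexed directly by i.
func addScalar(c, a, b []float32) {
	for i := range c {
		c[i] = a[i] + b[i]
	}
}

// addStripMined is the same loop split into blocks of `lanes` elements.
// Assumes a and b each hold at least len(c) elements.
func addStripMined(c, a, b []float32) {
	n := len(c)
	a, b = a[:n], b[:n] // express all bounds in terms of one length
	i := 0
	for ; i+lanes <= n; i += lanes {
		// These four statements stand in for one vector load/add/store.
		c[i+0] = a[i+0] + b[i+0]
		c[i+1] = a[i+1] + b[i+1]
		c[i+2] = a[i+2] + b[i+2]
		c[i+3] = a[i+3] + b[i+3]
	}
	// Scalar remainder when n is not a multiple of lanes.
	for ; i < n; i++ {
		c[i] = a[i] + b[i]
	}
}

func main() {
	a := make([]float32, 10)
	b := make([]float32, 10)
	for i := range a {
		a[i] = float32(i + 1)
		b[i] = float32(10 * (i + 1))
	}
	c1 := make([]float32, 10)
	c2 := make([]float32, 10)
	addScalar(c1, a, b)
	addStripMined(c2, a, b)
	fmt.Println(c1)
	fmt.Println(c2)
}
```

The block structure is the generic part. What changes per target is the value of lanes, whether the block body maps onto real vector instructions, and whether the starting address of each block has to be aligned.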