// byte swap
// let (a, b, c, d) be the bytes of x from high to low
// t1 = x right rotate 16 bits -- (c,   d,   a,   b  )
// t2 = x ^ t1                 -- (a^c, b^d, a^c, b^d)
// t3 = t2 &^ 0xff0000         -- (a^c, 0,   a^c, b^d)
// t4 = t3 >> 8                -- (0,   a^c, 0,   a^c)
// t5 = x right rotate 8 bits  -- (d,   a,   b,   c  )
// result = t4 ^ t5            -- (d,   c,   b,   a  )
// using shifted ops this can be done in 4 instructions.
(Bswap32 <t> x) ->
    (XOR <t>
        (SRLconst <t> (BICconst <t> (XOR <t> x (SRRconst <t> [16] x)) [0xff0000]) [8])
        (SRRconst <t> x [8]))
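
To make the steps in the comment concrete, here is a minimal Go sketch of the same sequence. It is only an illustration, not compiler code: the function name bswap32 is my own, and I use math/bits.RotateLeft32 with a negative count to stand in for the right rotates. It compares the result against math/bits.ReverseBytes32 for a few sample values.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bswap32 is a hypothetical helper that follows the rule's comment
// step by step. With (a, b, c, d) the bytes of x from high to low:
func bswap32(x uint32) uint32 {
	t1 := bits.RotateLeft32(x, -16) // rotate right 16: (c, d, a, b)
	t2 := x ^ t1                    // (a^c, b^d, a^c, b^d)
	t3 := t2 &^ 0xff0000            // clear second byte: (a^c, 0, a^c, b^d)
	t4 := t3 >> 8                   // (0, a^c, 0, a^c)
	t5 := bits.RotateLeft32(x, -8)  // rotate right 8: (d, a, b, c)
	return t4 ^ t5                  // (d, c, b, a)
}

func main() {
	// Sample inputs chosen arbitrarily for the comparison.
	for _, x := range []uint32{0, 0x11223344, 0xdeadbeef, 0xffffffff} {
		got := bswap32(x)
		want := bits.ReverseBytes32(x)
		fmt.Printf("%08x -> %08x (matches ReverseBytes32: %v)\n", x, got, got == want)
	}
}
```

A single REV would produce the same value as this whole sequence in one instruction.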
I think this rule could be reduced to a single REV instruction, which is available in ARMv6 and later. The instruction is described on page 562 of the "ARM Architecture Reference Manual", rev C.c.
If I want to make this change, should I follow the contribution guidelines at
https://golang.org/doc/contribute.html ?
Ben Shi