Conditional move & hoist optimizations

105 views
Skip to first unread message

Oliver Smith

unread,
Nov 10, 2020, 7:05:17 PM11/10/20
to golang-nuts
Looking at how go compiles/optimizes a couple of common constructs, which I boiled down to a simple '\t' replacement loop.

godbolt of sources: https://go.godbolt.org/z/Pnf3vh

```
func v1(s []byte, detab bool) (d []byte) {
  d = make([]byte, len(s))
  for i := 0; i < len(s); i++ {
    char := s[i]
    if detab && char == '\t' {
      char = ' '
    }
    d[i] = char
  }
  return d
}
```

which is branch happy:

```
v1_pc93:
        movb    DIB, (AX)(SI*1)
        incq    SI
v1_pc100:
        cmpq    SI, CX
        jge     v1_pc126
        movblzx (BX)(SI*1), DI
        testb   DL, DL
        jeq     v1_pc93
        cmpb    DIB, $9
        jne     v1_pc93
        movl    $32, DI
        jmp     v1_pc93
```

Coming from a C background, my first surprise was that the loop invariant wasn't hoisted, it didn't generate two versions of the loop body predicated on 'detab', but my second surprise was that it generates branch operations rather than a simple conditional move.

I do sort of love that go rewards you a little for letting it do a bit more of the lifting:

```
func v2(s []byte, detab bool) (d []byte) {
  d = make([]byte, len(s))
  for i := 0; i < len(s); i++ {
    if detab && s[i] == '\t' {
      d[i] = ' '
    } else {
      d[i] = s[i]
    }
  }
  return d
}
```

but that's countered by the failure to eliminate the invariant, which I can do manually thus:

```
func v3(s []byte, detab bool) (d []byte) {
  d = make([]byte, len(s))
  tabReplacement := byte('\t')
  if detab {
      tabReplacement = byte(' ')
  }
  for i := 0; i < len(s); i++ {
    if s[i] == '\t' {
      d[i] = tabReplacement
    } else {
      d[i] = s[i]
    }
  }
  return d
}
```

With manual unrolling to reduce the conditions, I still don't see the hoped-for cmov:

```
func v4(s []byte, detab bool) (d []byte) {
    d = make([]byte, len(s))
    if detab {
        for i := 0; i < len(s); i++ {
            c := s[i]
            if c == '\t' {
                c = ' '
            }
            d[i] = c
        }
    } else {
        for i := 0; i < len(s); i++ {
            d[i] = s[i]
        }
    }
    return d
}
```

produces

```
        jmp     v4_pc105
v4_pc98:
        movb    SIB, (AX)(BX*1)
        incq    BX
v4_pc105:
        cmpq    BX, CX
        jge     v4_pc127
        movblzx (DX)(BX*1), SI
        cmpb    SIB, $9
        jne     v4_pc98
        movl    $32, SI
        jmp     v4_pc98
```

what I was hoping to produce was:

```
v4_pc98:
        cmpq    BX, CX
        jge         v4_pc127
        movblzx (DX)(BX*1), SI
        cmpb    SIB, $9
        cmovl   $32, SI
        movb    SIB, (AX)(BX*1)
        incq    BX
        jmp     v4_pc98
```

anything along the lines of: https://gcc.godbolt.org/z/jvhj5a

My questions: Is any form of loop/invariant-hoisting performed? Does the compiler ever produce conditional moves? Where would I even look in the (go src) code?


Keith Randall

unread,
Nov 11, 2020, 6:55:23 PM11/11/20
to golang-nuts
> My questions: Is any form of loop/invariant-hoisting performed?

Yes, but not much. We hoist simple things like constants and spill restores. The big kahuna, loads, are typically not lifted because they depend on memory and we don't have the alias analysis to prove that that is safe.

We definitely don't do any loop duplication + specialization.

 > Does the compiler ever produce conditional moves? Where would I even look in the (go src) code?

Yes. Look in cmd/compile/internal/ssa/branchelim.go. We generate generic CondSelect operations there, then lowering passes convert to machine-specific conditional moves.
It looks like in your case we don't generate conditional moves because amd64 doesn't have byte-sized conditional moves (see canCondSelect in the mentioned file). If you make c an int instead of a byte, with the appropriate casts, then it works.
I think that no-byte-cmovs restriction can be lifted. It might cause a partial register stall, but maybe that's ok.
Reply all
Reply to author
Forward
0 new messages