How to test the 'noasm' code?


Sam Vilain

Apr 25, 2023, 11:37:55 AM
to golang-nuts
I have a module that has a couple of assembly functions (for CLZ aka BSR/LZCNT, which despite widespread availability[1] don't get any language support).  So I've got the assembly versions in per-arch files, and a "noasm" version that builds with "noasm" (amongst other typical conditions).

My question is: is there a way to specify this "noasm" when running 'go test'?

I tried using `-gcflags -complete`, but now I just get complaints about missing the function body:

    $ go test -gcflags '-complete' -v .
    ...
    ./primitives_asm.go:17:6: missing function body
    ./primitives_asm.go:20:6: missing function body
    FAIL ...

Is this a bug in the compiler, i.e. should "noasm" always be set with "-complete", which disables assembly/C?  Or is there some other way to set 'noasm', like a define flag, that I missed?

Thanks,
Sam

Ian Lance Taylor

Apr 25, 2023, 11:59:28 AM
to Sam Vilain, golang-nuts
Have you considered using the functions in the math/bits package?

I don't think you really said exactly how noasm works, but if it's a
build tag then the answer is "go test -tags=noasm". The conventional
name for a tag like that is "purego".
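
For illustration, here is a minimal sketch of what the pure-Go side of such a package might look like. The file name, the function name `Clz`, and the constraint expression `noasm || !amd64` are all assumptions, not taken from the module discussed here. In a real package this file would start with a `//go:build` line carrying that expression, and the assembly-backed files would carry the complementary one, so that `go test -tags=noasm` picks this version.

```go
// Hypothetical fallback file, e.g. primitives_noasm.go.
// Assumed build constraint expression for this file: noasm || !amd64
// (omitted here so the sketch runs as a standalone program).
package main

import (
	"fmt"
	"math/bits"
)

// Clz returns the number of leading zero bits in val.
// For val == 0 the result is 64.
func Clz(val uint64) uint8 {
	return uint8(bits.LeadingZeros64(val))
}

func main() {
	fmt.Println(Clz(1), Clz(0))
}
```

Running the same test suite once with and once without the tag would then exercise both implementations.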

Ian

Sam Vilain

Apr 25, 2023, 1:03:28 PM
to golang-nuts
Aha, thanks!  And there -tags is, on `go help build`.

I wasn't sure what 'noasm' was; I was copying from an example by searching my ~/go/pkg; in this case the 'snappy' source (github.com/golang/snappy).  I had assumed it was something that gets set automatically, if for whatever reason assembly is not available in the current toolchain.  It being a non-standard tag explains why my other searches and scanning of CLI help pages were fruitless.

My complaint about the language support... well, it's great that it's in the standard library.  I guess I was more meaning, here is an elementary operation for integers, supported on most architectures, but languages like C that "set the tone" of how we express operations skipped this one.  I realize punctuation was scarce when C was written and they had to draw the line somewhere, but since it didn't get its own operator, we now need to use tricks like the ones found in math/bits, or use the assembly escape hatch.  Perhaps it's just the way I write code but I often find myself wishing there was a simple way to get the rough scale of an integer without reaching for math.Log2. Counting leading zeroes gives you an integer log₂, and logarithms are common enough to get a button on most scientific calculators.

Looks like `math/bits` could use some assembly alternatives, too. Clever as those functions are, they're almost certainly not going to beat the microcode/silicon.  I'm looking into the contribution guidelines now!

Cheers,
Sam

Ian Lance Taylor

Apr 25, 2023, 1:48:57 PM
to Sam Vilain, golang-nuts
On Tue, Apr 25, 2023 at 10:03 AM Sam Vilain <s...@vilain.net> wrote:
>
> Looks like `math/bits` could use some assembly alternatives, too. Clever as those functions are, they're almost certainly not going to beat the microcode/silicon. I'm looking into the contribution guidelines now!

Note that many of the math/bits functions are actually implemented
directly by the compiler. Check the final executable on your
processors of interest.
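
One way to do that check, sketched under a few assumptions: the wrapper name `ctz` and the output name `ctzdemo` are made up for this example, and `//go:noinline` is only there so the function stays visible as its own symbol.

```go
package main

import (
	"fmt"
	"math/bits"
)

// ctz wraps bits.TrailingZeros64 so the generated code can be inspected.
//
//go:noinline
func ctz(x uint64) int {
	return bits.TrailingZeros64(x)
}

func main() {
	fmt.Println(ctz(8), ctz(0))
}
```

After `go build -o ctzdemo .`, running `go tool objdump -s 'main\.ctz' ctzdemo` should list the instructions emitted for the wrapper. Building with a different `GOAMD64` level (for example `GOAMD64=v3`) may change which instruction is selected.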

Ian

Sam Vilain

Apr 25, 2023, 1:57:07 PM
to golang-nuts

Oh, man.  I was wondering why, when I built my assembly version:

// Counts trailing Zero bits
TEXT ·Ctz(SB),NOSPLIT,$0-1
    MOVQ val+0(FP), AX  // value to be counted
    TZCNTQ AX, BX
    MOVB BX, ret+8(FP)
    RET

I got this:

$ go build -o hybrid8b.o .
$ go tool objdump -s hybrid8b.Ctz hybrid8b.o
warning: GOPATH set to GOROOT (/home/samv/go) has no effect
TEXT .../pkg/hybrid8b.Ctz(SB) gofile..<autogenerated>
  gofile..<autogenerated>:1 0x7ee4d 4883ec18 SUBQ $0x18, SP
  gofile..<autogenerated>:1 0x7ee51 48896c2410 MOVQ BP, 0x10(SP)
  gofile..<autogenerated>:1 0x7ee56 488d6c2410 LEAQ 0x10(SP), BP
  gofile..<autogenerated>:1 0x7ee5b 48890424 MOVQ AX, 0(SP)
  gofile..<autogenerated>:1 0x7ee5f e800000000 CALL 0x7ee64 [1:5]R_CALL:.../pkg/hybrid8b.Ctz
  gofile..<autogenerated>:1 0x7ee64 450f57ff XORPS X15, X15
  gofile..<autogenerated>:1 0x7ee68 644c8b342500000000 MOVQ FS:0, R14 [5:9]R_TLS_LE
  gofile..<autogenerated>:1 0x7ee71 0fb6442408 MOVZX 0x8(SP), AX
  gofile..<autogenerated>:1 0x7ee76 488b6c2410 MOVQ 0x10(SP), BP
  gofile..<autogenerated>:1 0x7ee7b 4883c418 ADDQ $0x18, SP
  gofile..<autogenerated>:1 0x7ee7f c3 RET

By comparison, here's what I got by using math/bits.TrailingZeros64:

$ go build -tags noasm -o hybrid8b.o .
$ go tool objdump -s hybrid8b.Ctz hybrid8b.o
warning: GOPATH set to GOROOT (/home/samv/go) has no effect
TEXT .../pkg/hybrid8b.Ctz(SB) gofile../home/samv/.../pkg/hybrid8b/primitives_misc.go
  primitives_misc.go:34 0x62130 480fbcc0 BSFQ AX, AX
  primitives_misc.go:34 0x62134 b940000000 MOVL $0x40, CX
  primitives_misc.go:34 0x62139 480f44c1 CMOVE CX, AX
  primitives_misc.go:34 0x6213d c3 RET

I was thinking, that's an interesting thing to be emitted by this source: 

func TrailingZeros64(x uint64) int {
    if x == 0 {
        return 64
    }
    // ...
    return int(deBruijn64tab[(x&-x)*deBruijn64>>(64-6)])
}

As I was thinking, this is presumably some kind of targeted function replacement done by the compiler.  This is what I was hoping to do by writing the assembly function, but it seems I'm being bitten by all the stuff relating to the calling convention, register saving, etc.

I guess my core problem is already solved, but my questions are:

(a) where can I find how this specific optimization is defined?

(b) is it possible to write assembly functions that avoid the wrapper code, assuming that one follows the platform's calling convention?

Sam

Ian Lance Taylor

Apr 25, 2023, 5:20:32 PM
to Sam Vilain, golang-nuts
On Tue, Apr 25, 2023 at 10:57 AM Sam Vilain <s...@vilain.net> wrote:
>
> (a) where can I find how this specific optimization is defined?

It's in the compiler. It's not especially easy to pull it out. In
this specific case, it's something like

cmd/compile/internal/ssagen/ssa.go:

addF("math/bits", "TrailingZeros64",
    func(s *state, n *ir.CallExpr, args []*ssa.Value) *ssa.Value {
        return s.newValue1(ssa.OpCtz64, types.Types[types.TINT], args[0])
    },
    sys.AMD64, sys.I386, sys.ARM64, sys.ARM, sys.S390X, sys.MIPS,
    sys.PPC64, sys.Wasm)


cmd/compile/internal/ssa/_gen/AMD64.rules:

(Ctz64 x) && buildcfg.GOAMD64 >= 3 => (TZCNTQ x)

cmd/compile/internal/ssa/_gen/AMD64Ops.go:

// count the number of trailing zero bits, prefer TZCNTQ over BSFQ, as TZCNTQ(0)==64
// and BSFQ(0) is undefined. Same for TZCNTL(0)==32
{name: "TZCNTQ", argLength: 1, reg: gp11, asm: "TZCNTQ", clobberFlags: true},


> (b) is it possible to write assembly functions that avoid the wrapper code, assuming that one follows the platform's calling convention?

Go uses its own calling convention. There is an internal
register-based ABI, but I don't think it's stable. See
https://go.googlesource.com/proposal/+/refs/heads/master/design/40724-register-calling.md.

Ian