Slow 64 bit integer division on amd64

685 views
Skip to first unread message

Dominik Honnef

unread,
Jan 19, 2013, 4:13:58 PM1/19/13
to golan...@googlegroups.com
Hi,

due to the change to 64 bit ints in tip, I noticed that 64 bit integer
division is almost three times slower than 32 bit integer division, on an
amd64 system[1].

I created a small collection of benchmarks and disassemblies[2] that
demonstrate the behaviour.

To be honest I lack the deeper understanding to say if this difference
in performance is to be expected or something that can be fixed, which
is why I am asking you to take a look at it.

In the case that this is actually expected, how should we deal with it
in future versions of Go, where int implies int64 on 64 bit systems?
Using int32 explicitly will definitely lead to less nice code because of
a lot of required conversions.


Thanks,
Dominik


[1]: Linux dominikh-pc 3.2.1-gentoo-r2 #7 SMP Mon May 7 16:40:12 CEST
2012 x86_64 Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz GenuineIntel
GNU/Linux

[2]: http://play.golang.org/p/4CbUZSj2lr

minux

unread,
Jan 19, 2013, 4:20:47 PM1/19/13
to golan...@googlegroups.com
On Sun, Jan 20, 2013 at 5:13 AM, Dominik Honnef <domi...@fork-bomb.org> wrote:
due to the change to 64 bit ints in tip, I noticed that 64 bit integer
division is almost three times slower than 32 bit integer division, on an
amd64 system[1].
This is to be expected.
doing 64-bit division will (must) be slower than 32-bit division,
because AFAIK, all division circuits use some sort of iteration
algorithm (that is, don't expect a single cycle division circuit
to be built with adequate clock frequency as the other single
cycle arithmetic circuits).

Jan Mercl

unread,
Jan 19, 2013, 4:22:58 PM1/19/13
to golang-nuts
On Sat, Jan 19, 2013 at 10:13 PM, Dominik Honnef <domi...@fork-bomb.org> wrote:

Cannot reproduce here (Xeon on dont-know-how-much GHz)

jnml@fsc-r630:~/src/tmp/20130119$ cat a_test.go
package main

import (
"testing"
)

func BenchmarkBase(b *testing.B) {
var x int32
for i := 1; i < b.N; i++ {
}
_ = x
}

func BenchmarkInt(b *testing.B) {
var x int
for i := 1; i < b.N; i++ {
x /= 42
}
_ = x
}

func BenchmarkInt32(b *testing.B) {
var x int32
for i := 1; i < b.N; i++ {
x /= 42
}
_ = x
}

func BenchmarkInt64(b *testing.B) {
var x int64
for i := 1; i < b.N; i++ {
x /= 42
}
_ = x
}
jnml@fsc-r630:~/src/tmp/20130119$ go test -bench .
testing: warning: no tests to run
PASS
BenchmarkBase 2000000000 0.88 ns/op
BenchmarkInt 100000000 10.3 ns/op
BenchmarkInt32 500000000 4.85 ns/op
BenchmarkInt64 100000000 10.2 ns/op
ok tmp/20130119 6.870s
jnml@fsc-r630:~/src/tmp/20130119$ go version
go version devel +422c00e8e022 Sun Nov 11 07:51:20 2012 +1100
jnml@fsc-r630:~/src/tmp/20130119$

-j

minux

unread,
Jan 19, 2013, 4:25:33 PM1/19/13
to Jan Mercl, golang-nuts
On Sun, Jan 20, 2013 at 5:22 AM, Jan Mercl <0xj...@gmail.com> wrote:
On Sat, Jan 19, 2013 at 10:13 PM, Dominik Honnef <domi...@fork-bomb.org> wrote:

Cannot reproduce here (Xeon on dont-know-how-much GHz)

jnml@fsc-r630:~/src/tmp/20130119$ cat a_test.go
package main

import (
        "testing"
)

func BenchmarkBase(b *testing.B) {
        var x int32
        for i := 1; i < b.N; i++ {
        }
        _ = x
}

func BenchmarkInt(b *testing.B) {
        var x int
        for i := 1; i < b.N; i++ {
        x /= 42
        }
        _ = x
}
division by a constant should be turned into multiplication by magic constants
by the Go compiler just to avoid the costly divide instruction.

you can verify that by 'go tool 6g -S'.

Michael Jones

unread,
Jan 19, 2013, 6:10:34 PM1/19/13
to minux, Jan Mercl, golang-nuts
Integer division is slow. 

It is the unloved arithmetic operation in CPU design because of its low frequency of occurrence in the instruction stream. It is the one--if you remember childhood schooling--that was "not like the others" in its mechanism: there was lots of guessing using leading digits, trial divisors, and so on. This awkwardness exists even in base 2 (though much less) and means that any division circuit my have to do at least one "fixup" step. (A good design does either zero or one fixup, but that one means more work and plausibly an extra cycle on what is already a long cycle-count activity.)

As one specific example, when adding and subtracting were 1 cycle and multiply was 4, a 32-bit integer division required 19 cycles.


--
 
 



--
Michael T. Jones | Chief Technology Advocate  | m...@google.com |  +1 650-335-5765

ra...@cockroachlabs.com

unread,
Nov 8, 2016, 12:18:02 AM11/8/16
to golang-nuts, minu...@gmail.com, 0xj...@gmail.com
Sorry to revive an old thread.  One thing to note is that dividing a 64-bit by a 32-bit is much faster than dividing a 64-bit by a 64-bit, at least on recent Intel platforms. Unfortunately there is no valid way to express that kind of calculation in Go..

Is the compiler/optimizer smart enough to figure out it can do that if you do something like `uint32(a / uint64(b))` where a is `uint64` and b is `uint32`? (the result of that expression can be computed using the 32-bit version of DIV)

andrew_...@trimble.com

unread,
Nov 8, 2016, 12:55:01 AM11/8/16
to golang-nuts, minu...@gmail.com, 0xj...@gmail.com, ra...@cockroachlabs.com
I imagine the new SSA backend can do this with its pattern matching, but you would need to look for it here:

https://github.com/golang/go/blob/master/src/cmd/compile/internal/ssa/gen/generic.rules
or
https://github.com/golang/go/blob/master/src/cmd/compile/internal/ssa/gen/AMD64.rules

Maybe someone else can confirm.

alb.do...@gmail.com

unread,
Nov 10, 2016, 8:46:02 AM11/10/16
to golang-nuts, minu...@gmail.com, 0xj...@gmail.com, ra...@cockroachlabs.com
When you do a DIV r32 in 64bit mode the result will be saved in
EAX but this means that you can only do that if the high word
of the dividend is smaller that the divisor.

You can't unconditionally replace uint32(A / uint64(B)) with a
DIV r32 instruction because if A is big (take 2**62) and B is
small (take 13) then the quotient will overflow EAX.

A.

ra...@cockroachlabs.com

unread,
Nov 10, 2016, 9:01:06 AM11/10/16
to golang-nuts, minu...@gmail.com, 0xj...@gmail.com, ra...@cockroachlabs.com, alb.do...@gmail.com
I don't understand. We are truncating the result to 32-bits either way. What's the point of computing the entire 64-bits of the result just to throw away the high 32?

In your example, A=2^62 and B=13`, `uint32(A / uint64(B))` is (2^62 / 13) % 2^32 which I thought is exactly the result of DIV r32.

-Radu

alb.do...@gmail.com

unread,
Nov 10, 2016, 9:05:53 AM11/10/16
to golang-nuts, minu...@gmail.com, 0xj...@gmail.com, ra...@cockroachlabs.com, alb.do...@gmail.com
If the div overflow, it will trigger and hardware exception.

ra...@cockroachlabs.com

unread,
Nov 10, 2016, 9:30:55 AM11/10/16
to golang-nuts, minu...@gmail.com, 0xj...@gmail.com, ra...@cockroachlabs.com, alb.do...@gmail.com
Ah, I missed that, thanks!

Michael Jones

unread,
Nov 10, 2016, 9:40:39 AM11/10/16
to alb.do...@gmail.com, golang-nuts, minu...@gmail.com, 0xj...@gmail.com, ra...@cockroachlabs.com

A=2^62

B=13

 

A/B = 4611686018427387904/13

A/B = quotient 354745078340568300 with remainder 4

A/B = (82595524*2^32 + 3964585196) with remainder 4

 

Consider the quotient:

H = Upper 32 bits of 354745078340568300 = 82595524

L = Lower 32 bits of 354745078340568300 = 3964585196

 

Apparently you want L.

 

uint32(uint64(354745078340568300)) = 3964585196

 

The overflow condition is that the quotient is not representable in 32 bits. Reference:

http://stackoverflow.com/questions/22596059/problems-dividing-64-bits-in-x86-assembly

Nick Craig-Wood

unread,
Nov 13, 2016, 4:59:48 AM11/13/16
to alb.do...@gmail.com, golang-nuts, minu...@gmail.com, 0xj...@gmail.com, ra...@cockroachlabs.com
Traditionally you'd write that sort of thing in assembler to get that
amount of control. Though the modern thing would be to have intrinsics..

I guess in the go world, having assembler routines that could be inlined
would be the optimum solution.
> --
> You received this message because you are subscribed to the Google
> Groups "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to golang-nuts...@googlegroups.com
> <mailto:golang-nuts...@googlegroups.com>.
> For more options, visit https://groups.google.com/d/optout.


--
Nick Craig-Wood <ni...@craig-wood.com> -- http://www.craig-wood.com/nick
Reply all
Reply to author
Forward
0 new messages