atomic.Add*() doesn't produce unique values

226 views
Skip to first unread message

Liam

unread,
Nov 29, 2019, 9:21:22 PM11/29/19
to golang-nuts
Does atomic.AddInt32(&x, 1) always yield unique values for concurrent callers?

I'm guessing not, because (I think) I'm seeing that two callers get x+2, neither gets x+1.

Is there a way to generate unique values with pkg atomic, or is a mutex required?

burak serdar

unread,
Nov 29, 2019, 9:25:41 PM11/29/19
to Liam, golang-nuts
On Fri, Nov 29, 2019 at 7:21 PM Liam <networ...@gmail.com> wrote:
>
> Does atomic.AddInt32(&x, 1) always yield unique values for concurrent callers?
>
> I'm guessing not, because (I think) I'm seeing that two callers get x+2, neither gets x+1.

That should not happen, afaik. Do you have code you can share?

>
> Is there a way to generate unique values with pkg atomic, or is a mutex required?
>
> --
> You received this message because you are subscribed to the Google Groups "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/b8cd30f3-effe-4d1d-ac5b-4bb29bbd9a10%40googlegroups.com.

Kurtis Rader

unread,
Nov 29, 2019, 9:41:39 PM11/29/19
to Liam, golang-nuts
On Fri, Nov 29, 2019 at 6:21 PM Liam <networ...@gmail.com> wrote:
Does atomic.AddInt32(&x, 1) always yield unique values for concurrent callers?

I'm guessing not, because (I think) I'm seeing that two callers get x+2, neither gets x+1.

That shouldn't happen, AFAICT. Can you share the code where the incorrect behavior is occurring? Or, preferably, a simple reproducer program?
  
Is there a way to generate unique values with pkg atomic, or is a mutex required?

Keep in mind that atomic.AddInt32() has the usual two's-complement  overflow semantics. If all you want is a generation counter you really should be using a uint32 and atomic.AddUint32(). Also, depending on your preferences and performance considerations you might find it preferable to use a channel that holds a single int, or small number of ints, that is fed by a producer goroutine and consumed by any context needing a uniq ID. That makes it easier to abstract the generation of "unique" ints so that they satisfy other constraints (e.g., they must be even, odd, prime, etc.).

--
Kurtis Rader
Caretaker of the exceptional canines Junior and Hank

Liam

unread,
Nov 30, 2019, 2:56:02 PM11/30/19
to golang-nuts
The stress test for my app fails frequently with what looks like a collision in atomic.AddUint64() results, so I wondered whether I had misunderstood atomic-add.

So far I can't reproduce it with a small program, so I've probably misunderstood my app :-)

Robert Engels

unread,
Nov 30, 2019, 4:02:23 PM11/30/19
to Liam, golang-nuts
If this was broken I think a lot of things would break. 

On Nov 30, 2019, at 1:56 PM, Liam <networ...@gmail.com> wrote:


--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.

Michael Jones

unread,
Nov 30, 2019, 6:38:51 PM11/30/19
to Robert Engels, Liam, golang-nuts
Liam,

I just wrote a little stress test program for you. Maybe it will make you less stressed. ;-)

4 CPU 2016 MacBook Pro:
celeste:atom mtj$ go run main.go
32 concurrent workers
128 batches of 1048576 atomic increments, 134217728 total increments
2.850 seconds elapsed, 47088064 atomic increments/sec
0 collisions

18 CPU 2019 iMacPro:
plum:atom mtj$ go run main.go
32 concurrent workers
128 batches of 1048576 atomic increments, 134217728 total increments
2.730 seconds elapsed, 49167382 atomic increments/sec
0 collisions

Exhaustive demonstration is no proof, but changing the parameters here may increase your comfort.

Michael



--
Michael T. Jones
michae...@gmail.com

Michael Jones

unread,
Nov 30, 2019, 8:33:50 PM11/30/19
to Robert Engels, Liam, golang-nuts
As a follow-up, some more timing:

47088064 atomic increments/sec (my original email above for heavy synchronization conflict incrementing)

142049067 atomic increments/sec when each goroutine has its own atomic update target. (Not testing global synchronization/mutex, just the overhead of congested vs not.)

426232527 ordinary "x++" increments in the workers.

General idea to remember:

Atomic increment is ~3x slower than simple add when uncontested.
Highly contested atomic increment is ~3x closer than uncontested, therefore ~9x-10x slower than simple add.

10x is not insignificant, but is nevertheless remarkable for a reliable atomic operation. This was once, "back in the day", a remarkably expensive operation, an a feat of genius to accomplish (Dekker's Algorithm). That it is now just a number-of-fingers cycles is fantastic progress!

Liam

unread,
Nov 30, 2019, 10:06:41 PM11/30/19
to golang-nuts
I wrote a less-sophisticated version of your test, then realized I'd misspent my time; all I needed was to change the atomic.Add*() to a mutex-protected counter, and see whether my app still failed; it did.

But since you took the trouble, I read your code, and would like to understand your collision detector. Could you explain this bit?

for _, v := range a {
  mask := byte(1 << (v & (8 - 1)))
  index := v >> 3

  if tally[index]&mask != 0 { ... }
  ...
To unsubscribe from this group and stop receiving emails from it, send an email to golan...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golan...@googlegroups.com.


--
Michael T. Jones
michae...@gmail.com

Michael Jones

unread,
Dec 1, 2019, 5:21:55 PM12/1/19
to Liam, golang-nuts
Oh! That's just a bit per integer in the test range 0..total-1. Since Go (and everything else) lacks a bit type, I just type such code automatically. Bytes hold 8 bits. Array size must be rounded up, so

a := make([]byte, (total+8-1)/8)

array index for test integer n is n/8, so "n>>3"

bit index for the j-th bit, counting up from 0 for the 1's place is "1<<j"

j is n%8, so "n&(8-1)"

if mask=1<<(n&(8-1)) then one can test if the bit is set with

a[n>>3] & mask != 0

to set it is 

a[n>>3] |= mask

the values 3 and 8 here are from 8 bits in a byte and 8 = 2**3. if using 64-bit ints they become 6 and 64. 

To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/4d091a92-707a-40dc-8d1b-f12e10426438%40googlegroups.com.

robert engels

unread,
Dec 1, 2019, 5:32:30 PM12/1/19
to Michael Jones, Liam, golang-nuts
I’d prefer v / 8 over v >> 3 - provides more context in my opinion. The compiler will change to right shift if more efficient anyway.

Michael Jones

unread,
Dec 1, 2019, 8:09:10 PM12/1/19
to robert engels, Liam, golang-nuts
agree, makes sense. also if you trust the compiler, change to use the mod function, etc. 

Liam

unread,
Dec 1, 2019, 10:18:43 PM12/1/19
to golang-nuts
Oh you've allocated a bit array for every value in the test range, then checked for gaps in it?

Robert Engels

unread,
Dec 1, 2019, 10:46:39 PM12/1/19
to Liam, golang-nuts
The updating of the bit array if shared needs to atomic as well, probably with a read and cas.  

On Dec 1, 2019, at 9:19 PM, Liam <networ...@gmail.com> wrote:


To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/4a457f1f-7956-474a-b29a-909aee0e55c3%40googlegroups.com.

Michael Jones

unread,
Dec 1, 2019, 11:44:57 PM12/1/19
to Robert Engels, Liam, golang-nuts
not necessary as the testing and updating is only done in one place by one the main goroutine. 

Michael Jones

unread,
Dec 1, 2019, 11:48:17 PM12/1/19
to Robert Engels, Liam, golang-nuts
"Oh you've allocated a bit array for every value in the test range, then checked for gaps in it?"

Yes. What I should have said. (Though the test looks not for gaps but for two pigeons in one hole, but the same idea.)

Robert Engels

unread,
Dec 2, 2019, 1:14:38 AM12/2/19
to Michael Jones, Liam, golang-nuts
Sorry, I hadn’t looked at the code, but after reviewing doesn’t your code not detect the case where a value was skipped ? (... not that it could happen - but for completeness of the robust test)

Michael Jones

unread,
Dec 2, 2019, 1:24:33 AM12/2/19
to Robert Engels, Liam, golang-nuts
true. not roust that way. also, it uses "Go array bounds checking" as the implicit test for crazy out of range values. just a 5 min stream of consciousness hack to defend the honor of the Go atomic primitives. ;-) It was typed one handed while the the other propped up my macbook here in bed. easily improved.

in fact, i did improve it after sending that original email. i added outer loops to time repeatedly, and above that, to iterate across numbers of workers. also added an option to have the workers do a little work between atomic operations just to see the expected "atomic cost in the noise" by direct experience...

...works great, scales well.

4cpu+4smt cpus
celeste:atom mtj$ time atom
  1 workers, 256 batches of 131072 atomic increments, 33554432 total increments: 18.21601 sec
  2 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  9.44632 sec
  3 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  6.77645 sec
  4 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  5.07302 sec
  5 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  4.54578 sec
  6 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  3.59711 sec
  7 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  3.21376 sec
  8 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  2.98081 sec

18+18
xplum:atom mtj$ time atom
  1 workers, 256 batches of 131072 atomic increments, 33554432 total increments: 15.36729 sec
  2 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  8.05586 sec
  3 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  5.71047 sec
  4 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  4.25619 sec
  5 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  3.50413 sec
  6 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  2.95954 sec
  7 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  2.57278 sec
  8 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  2.29949 sec
  9 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  2.14102 sec
 10 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.96580 sec
 11 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.86208 sec
 12 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.75674 sec
 13 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.67355 sec
 14 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.60104 sec
 15 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.55347 sec
 16 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.46876 sec
 17 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.47974 sec
 18 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.46271 sec
 19 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.42377 sec
 20 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.37417 sec
 21 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.38023 sec
 22 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.32782 sec
 23 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.33747 sec
 24 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.27842 sec
 25 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.27863 sec
 26 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.22864 sec
 27 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.21858 sec
 28 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.21807 sec
 29 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.16772 sec
 30 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.15762 sec
 31 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.15192 sec
 32 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.10269 sec
 33 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.09875 sec
 34 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.09120 sec
 35 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.08135 sec
 36 workers, 256 batches of 131072 atomic increments, 33554432 total increments:  1.09140 sec

Reply all
Reply to author
Forward
0 new messages