Idiomatic Go for atomic boolean flag situation?

15,210 views
Skip to first unread message

Jake Brukhman

unread,
Feb 16, 2012, 11:30:04 AM2/16/12
to golang-nuts
I have a bunch of data coming in on a channel:

// "data" channel is set up
for {
v := <-data
processData(v)
}

Now, I wish to "record" some portions of this stream, and so I have a
flag that tests when recording is to take place:

for {
v := <-data
if recording {
record(v)
}
processData(v)
}

However, I wish to be able to flip this flag atomically from different
goroutines. For instance, in Java I would use an AtomicBoolean to
synchronize the flag. What is idiomatic Go for this? I could check a
"recording" channel each time, but a whole select statement seems
overkill for something so simple.

Patrick Mylund Nielsen

unread,
Feb 16, 2012, 11:50:18 AM2/16/12
to Jake Brukhman, golang-nuts
In this case it may be simplest to use sync.Mutex:

var (
rec bool
mu = &sync.Mutex{}
)

func recording() bool {
mu.Lock()
defer mu.Unlock()
return rec
}

func setRecording(s bool) {
mu.Lock()
defer mu.Unlock()
rec = s
}

...
for {
v := <-data
if recording() {
record(v)
}
processData(v)

Jake Brukhman

unread,
Feb 16, 2012, 11:54:51 AM2/16/12
to golang-nuts
Yes, so I'd be providing an atomic boolean myself. One question:

func recording() bool {
mu.Lock()
defer mu.Unlock()
return rec
}

Is that actually thead-safe? I thought that the lock was released
before the return statement is executed?

On Feb 16, 11:50 am, Patrick Mylund Nielsen

Patrick Mylund Nielsen

unread,
Feb 16, 2012, 11:56:59 AM2/16/12
to Jake Brukhman, golang-nuts
Should be. Deferreds are called after the returning statement is
evaluated, but before it's returned to the caller:
http://golang.org/doc/go_spec.html#Defer_statements

Vanja Pejovic

unread,
Feb 16, 2012, 12:01:37 PM2/16/12
to Patrick Mylund Nielsen, Jake Brukhman, golang-nuts
How important is immediate response / order of operations? If you don't mind recording a value when recording is false, and you don't mind not recording a value when recording is set to true, then you should be able to do it without any synchronization. As far as I know, multiple reads and writes from a single boolean variable will not result in data corruption, but the updates to the value could be delayed, hypothetically indefinitely.

If this is not ok with you, I would recommend a slight modification to Patrick's method since his has a race condition (if recording() returns, then rec is changed, then if recording() is evaluated):

var (
       rec bool
       mu = &sync.Mutex{}
)

func maybeRecord(v V) {
       mu.Lock()
       defer mu.Unlock()
       if rec {
              record(v)
       }
}

func setRecording(s bool) {
       mu.Lock()
       defer mu.Unlock()
       rec = s
}

...
for {
      v := <-data
      rmaybeRecord(v)
      processData(v)

Patrick Mylund Nielsen

unread,
Feb 16, 2012, 12:14:09 PM2/16/12
to Vanja Pejovic, Jake Brukhman, golang-nuts
Good point; the record method could just do the sync. And yeah, if you
don't synchronize reads, goroutines are not guaranteed to ever receive
updates.

Patrick Mylund Nielsen

unread,
Feb 16, 2012, 12:25:37 PM2/16/12
to Vanja Pejovic, Jake Brukhman, golang-nuts
On second thought, keeping the mutex locked for the entire duration
it's recording may not be desirable. Ensuring updates are received
relatively quickly and living with the race condition--if recording()
{ go record(v) }--would probably be okay here.

On Thu, Feb 16, 2012 at 18:14, Patrick Mylund Nielsen

Vanja Pejovic

unread,
Feb 16, 2012, 12:47:39 PM2/16/12
to Patrick Mylund Nielsen, Jake Brukhman, golang-nuts
Good point, or if recording needs to be done sequentially:

func maybeRecord(v V) {
       mu.Lock()
       toRec := rec
       mu.Unlock()
       if toRec {
              record(v)
       }
}

or in a seperate goroutine:

func maybeRecord(v V) {
       mu.Lock()
       defer mu.Unlock()
       if rec {
              toRecord <- v
       }
}
and read from toRecord in the other goroutine

Jake Brukhman

unread,
Feb 16, 2012, 1:16:29 PM2/16/12
to Vanja Pejovic, Patrick Mylund Nielsen, golang-nuts
I don't anticipate writes to be happening so frequently that the situation described above is going to matter. Actually, I'm not 100% that what you described can happen... have you seen it? It seems to me that if statements should be evaluated atomically to the reads they contain, no?

I wrote an AtomicValue object that looks like this: 

import "sync"

// AtomicValue holds an arbitrary
// value whose writes and reads
// are synchronized against a
// read-write lock
type AtomicValue struct {
    Value interface{}
    lock *sync.RWMutex
}

func New() *AtomicValue {
    return &AtomicValue{
        lock: &sync.RWMutex{},
    }
}

func NewWithValue(v interface{}) *AtomicValue {
    return &AtomicValue{
        Value: v,
        lock: &sync.RWMutex{},
    }
}

func (a *AtomicValue) Set(v interface{}) {
    a.lock.Lock()
    defer a.lock.Unlock()
    a.Value = v
}

func (a *AtomicValue) Get() interface{} {
    a.lock.RLock()
    defer a.lock.RUnlock()
    return a.Value

Vanja Pejovic

unread,
Feb 16, 2012, 1:40:45 PM2/16/12
to Jake Brukhman, Patrick Mylund Nielsen, golang-nuts
I am not an expert on this but this is generally how I think about it:
A context switch (changing from one thread of goroutine to another) can happen between any two machine instructions. Since returning the value of recording() and evaluating the if statement are not the same machine statement, they are not evaluated atomically.

minux

unread,
Feb 16, 2012, 1:48:55 PM2/16/12
to Vanja Pejovic, Jake Brukhman, Patrick Mylund Nielsen, golang-nuts
On Fri, Feb 17, 2012 at 2:40 AM, Vanja Pejovic <vva...@gmail.com> wrote:
I am not an expert on this but this is generally how I think about it:
A context switch (changing from one thread of goroutine to another) can happen between any two machine instructions. 
Currently, goroutine scheduling is not preemptive, so this won't happen (by Go scheduler).
In fact, the goroutine will run as long as it can be (e.g. not block by channel send/recv, syscall, mutex, etc.).

Of course, the OS can preempt one thread at any time, so if your GOMAXPROCS > 1, then you are correct.

Bobby Powers

unread,
Feb 16, 2012, 3:11:36 PM2/16/12
to minux, Vanja Pejovic, Jake Brukhman, Patrick Mylund Nielsen, golang-nuts
just use CAS:

import "sync/atomic"

var recording int32

func isRecording() bool {
return recording > 0
}

func setRecording(shoudRecord bool) {
if shouldRecord {
atomic.CompareAndSwapInt32(&recording, 0, 1)
} else {
atomic.CompareAndSwapInt32(&recording, 1, 0)
}
}

Or am I missing something?

Henrik Johansson

unread,
Feb 16, 2012, 3:26:47 PM2/16/12
to Bobby Powers, golang-nuts, minux, Jake Brukhman, Vanja Pejovic, Patrick Mylund Nielsen

Thats how the corresponding java constructs mentioned earlier would have been used. So yes i guess.

Patrick Mylund Nielsen

unread,
Feb 16, 2012, 3:28:21 PM2/16/12
to Henrik Johansson, Bobby Powers, golang-nuts, minux, Jake Brukhman, Vanja Pejovic
It has the same problems, no? isRecording is never guaranteed to
return the current value?

Does sync/atomic work differently?

Bobby Powers

unread,
Feb 16, 2012, 4:25:23 PM2/16/12
to Patrick Mylund Nielsen, Henrik Johansson, golang-nuts, minux, Jake Brukhman, Vanja Pejovic
sync/atomic CAS functions are implemented on x86_64 as a LOCK CMPXCHG
pair of instructions, and behave as expected expected:

According to Chapter 8 Bus Locking, of the Intel 64 and IA-32
Architectures Software Developer’s Manual, Volume 3A
The memory-ordering model prevents loads and stores from being
reordered with locked instructions that execute earlier or later. [1]

So isRecording is guaranteed to return the correct value.

Bobby

1 - http://stackoverflow.com/questions/9027590/do-we-need-mfence-when-using-xchg

Patrick Mylund Nielsen

unread,
Feb 16, 2012, 4:50:46 PM2/16/12
to Bobby Powers, Henrik Johansson, golang-nuts, minux, Jake Brukhman, Vanja Pejovic
Very cool!

Vanja Pejovic

unread,
Feb 16, 2012, 4:53:43 PM2/16/12
to Bobby Powers, Patrick Mylund Nielsen, Henrik Johansson, golang-nuts, minux, Jake Brukhman
I'm not sure about that, consider this scenario:

- recording is 1
- isRecording is called, the boolean expression (recording > 0) is evaluated (to true) and stored
- switch context to a different thread that calls setRecording(false), and complete that method
- switch context back to isRecording and return result of the evaluated expression (true). (even though recording == 0 now)

Bobby Powers

unread,
Feb 16, 2012, 5:48:18 PM2/16/12
to Vanja Pejovic, Patrick Mylund Nielsen, Henrik Johansson, golang-nuts, minux, Jake Brukhman
The go compilers have inlining turned on by default; isRecording()
should be inlined when used like:

if isRecording() {
...
}

Otherwise, you're right. You can always explicitly do:

if recording > 0 {
...
}

yours,
Bobby

Evan Shaw

unread,
Feb 16, 2012, 6:47:06 PM2/16/12
to Bobby Powers, minux, Vanja Pejovic, Jake Brukhman, Patrick Mylund Nielsen, golang-nuts
On Fri, Feb 17, 2012 at 9:11 AM, Bobby Powers <bobby...@gmail.com> wrote:
> import "sync/atomic"
>
> var recording int32
>
> func isRecording() bool {
>    return recording > 0
> }
>
> func setRecording(shoudRecord bool) {
>    if shouldRecord {
>        atomic.CompareAndSwapInt32(&recording, 0, 1)
>    } else {
>        atomic.CompareAndSwapInt32(&recording, 1, 0)
>    }
> }

In practice this code will probably work, but it doesn't set a good
example. Instead of a compare-and-swap, you just want an atomic store.
recording should also be loaded atomically in isRecording.

import "sync/atomic"

var recording int32

func isRecording() bool {
return atomic.LoadInt32(&recording) != 0
}

func setRecording(shouldRecord bool) {
if shouldRecord {
atomic.StoreInt32(&recording, 1)
} else {
atomic.StoreInt32(&recording, 0)
}
}

This still doesn't avoid the tricky problem Vanja pointed out, but I'm
under the impression that Jake just wants to avoid getting his flag
somehow corrupted. For that case, this is good enough.

> The go compilers have inlining turned on by default; isRecording()
> should be inlined when used like:
>
> if isRecording() {
> ...
> }

Just because a function has been inlined does not mean it is atomic.
And it's probably not a good idea to depend on the compiler to inline
in any case.

- Evan

Vanja Pejovic

unread,
Feb 16, 2012, 6:50:25 PM2/16/12
to Bobby Powers, Patrick Mylund Nielsen, Henrik Johansson, golang-nuts, minux, Jake Brukhman
I don't know enough to agree or disagree with that, but even if you are right, and you do

if recording > 0 {
   record(v)
}

recording could be 1
then you evaluate the if statement, and go into the block,
then switch context and change recording to 0
then switch back and record(v). You've recorded while recording is 0.

I don't think that it's a big deal in this case at all since it just means missing a single value or recording one you weren't supposed to, but the only way (I think) to be absolutely precise is to have the check and the record (or at least the dispatch to record) in the same mutex block.

Vanja Pejovic

unread,
Feb 16, 2012, 6:54:59 PM2/16/12
to Evan Shaw, Bobby Powers, minux, Jake Brukhman, Patrick Mylund Nielsen, golang-nuts
About the flag corruption, how would go behave if he just did simple unprotected reads and writes on a boolean value? I don't think it would be possible to corrupt the value, but is it possible that one thread would never see the other's updates because of independent caches?  (assuming GOMAXPROCS > 1 and the two goroutines are always happen to map to their own threads) 

Patrick Mylund Nielsen

unread,
Feb 16, 2012, 7:03:09 PM2/16/12
to Vanja Pejovic, Evan Shaw, Bobby Powers, minux, Jake Brukhman, golang-nuts
As long as you don't explicitly synchronize, there is no guarantee
that changes will be observed:
http://golang.org/doc/go_mem.html#tmp_133

Adam Logghe

unread,
Feb 16, 2012, 7:04:21 PM2/16/12
to golan...@googlegroups.com
Hehe... the more I read of this thread the more I thought "hmmm I think I'd try a sync startrecord channel that the main thread select'ed on"

But that's because this stuff makes my head hurt.

Adam

Patrick Mylund Nielsen

unread,
Feb 16, 2012, 7:11:58 PM2/16/12
to Evan Shaw, Bobby Powers, minux, Vanja Pejovic, Jake Brukhman, golang-nuts
Isn't this essentially the same as using a sync.Mutex, only less
readable? Do the CMPXCHG instructions carry any appreciable
performance benefit?

Bobby Powers

unread,
Feb 16, 2012, 7:53:19 PM2/16/12
to Vanja Pejovic, Patrick Mylund Nielsen, Henrik Johansson, golang-nuts, minux, Jake Brukhman
The OP asked for the equivalent of Java's AtomicBoolean, which are the
functions in sync/atomic. And, really, I don't think that locking for
the entire duration of record() is actually what you want.

func record() {
...
}

if recording > 0 {
record()
}

can be rewritten as

func record() {
if recording <= 0 {
return
}
...
}

In this case, all putting a mutex around record() will do is to cause
the writer to sleep until record() is done, it won't cancel the
execution of record(). And if its not going to cancel record(), then
the only time you care about the value of recording is at the
beginning of the function, when you're deciding whether to execute it
or not. It doesn't matter if you get unscheduled immediately after
you check.

To put it another way, consider your timeline, modified to use a mutex:

> Context 1 grabs the mutex


> recording could be 1
> then you evaluate the if statement, and go into the block,

> then switch to context 2 and change recording to 0. This thread sleeps, blocking on the mutex.
> then switch back and record(v). You've recorded while recording is sleeping, waiting to set recording to 0
> context 1 drops the mutex
> context 2 wakes and acquires mutex, setting recording to 0.

So a mutex hasn't actually changed the behavior of this situation at
all. Unless I'm missing something, or am wrong :)

yours,
Bobby

Patrick Mylund Nielsen

unread,
Feb 16, 2012, 8:53:18 PM2/16/12
to Evan Shaw, Bobby Powers, minux, Vanja Pejovic, Jake Brukhman, golang-nuts
To answer my own question using a shoddy test
(https://gist.github.com/1849411): Yes. (I figured, but I wasn't sure
if the difference was significant.)

# GOMAXPROCS=4 go test --bench=. --parallel=1
BenchmarkAtomicInt32-4 1000000000 0.03 ns/op
BenchmarkMutex-4 2000000000 0.05 ns/op
BenchmarkConcurrentAtomicInt32-4 1000000000 0.05 ns/op
BenchmarkConcurrentMutex-4 1000000000 0.56 ns/op

Each op is 2 million stores and reads, to be fair, but atomic _is_
much faster, and so probably preferable for the purpose we've
discussed (as long as you accept the race condition.)

Thanks Bobby and Evan!

Best,
Patrick

Jan Mercl

unread,
Feb 17, 2012, 2:49:41 AM2/17/12
to golan...@googlegroups.com
On Friday, February 17, 2012 2:53:18 AM UTC+1, Patrick Mylund Nielsen wrote:
To answer my own question using a shoddy test
(https://gist.github.com/1849411): Yes. (I figured, but I wasn't sure
if the difference was significant.)

# GOMAXPROCS=4 go test --bench=. --parallel=1
BenchmarkAtomicInt32-4                        1000000000 0.03 ns/op
BenchmarkMutex-4                                2000000000 0.05 ns/op
BenchmarkConcurrentAtomicInt32-4        1000000000 0.05 ns/op
BenchmarkConcurrentMutex-4                1000000000 0.56 ns/op

Each op is 2 million stores and reads, to be fair, but atomic _is_
much faster, and so probably preferable for the purpose we've
discussed (as long as you accept the race condition.)

I'm sorry to be harsh, but all the numbers above are garbage*. Also no, 2 million stores and reads won't happen in 0.03 ns (light would travel only 4,5 nm per a store/read cycle in that 66 PHz CPU ;-)

(*) See discussion of b.N at top of http://goo.gl/9BTZ0

minux

unread,
Feb 17, 2012, 8:17:00 AM2/17/12
to Jan Mercl, golan...@googlegroups.com
The numbers doesn't mean that, it means it runs 1000000000 times the BenchmarkAtomicInt32, and the average time of ONE operation is 0.03ns.

Jan Mercl

unread,
Feb 17, 2012, 8:39:24 AM2/17/12
to golan...@googlegroups.com, Jan Mercl
On Friday, February 17, 2012 2:17:00 PM UTC+1, minux wrote:
Quoting from the source (empasizes are mine):

"BenchmarkAtomicInt32-4                        1000000000 0.03 ns/op

and

"Each op is 2 million stores and reads, to be fair, ..."

It's pretty unambiguous, how else it could be understood than: op == 2M R/Ws and op takes 0.03ns ? The garbage number follows from ignoring b.N, it just surprises me how such obviously insane number can get along w/o noticing it's beyond capabilities of any HW ever built?

Patrick Mylund Nielsen

unread,
Feb 17, 2012, 8:45:19 AM2/17/12
to Jan Mercl, golan...@googlegroups.com
I did say the test was shoddy, and yes, in this case optimization
works against me.

Please, rewrite the test to produce a result that isn't "garbage." :)

Patrick Mylund Nielsen

unread,
Feb 17, 2012, 8:54:25 AM2/17/12
to Jan Mercl, golan...@googlegroups.com
Using b.N (https://gist.github.com/1849411):

BenchmarkAtomicInt32-4 50000000 31.1 ns/op
BenchmarkMutex-4 20000000 90.3 ns/op
BenchmarkConcurrentAtomicInt32-4 5000000 530 ns/op
BenchmarkConcurrentMutex-4 500000 5478 ns/op

So, the conclusion is the same.

Thanks Jan.

Jan Mercl

unread,
Feb 17, 2012, 9:00:20 AM2/17/12
to golan...@googlegroups.com, Jan Mercl
On Friday, February 17, 2012 2:45:19 PM UTC+1, Patrick Mylund Nielsen wrote:
I did say the test was shoddy, and yes, in this case optimization
works against me.

Please, rewrite the test to produce a result that isn't "garbage." :)

Why if you've already done that? ;-)

Hera are the numbers from my machine (Intel E5400 @ 2.7 Ghz, 2 cores only):

14:57:35 jnml@shinigami:~/src/tmp$ GOMAXPROCS=2 go test -bench B -parallel 1
testing: warning: no tests to run
PASS
BenchmarkAtomicInt32-2 50000000        39.7 ns/op
BenchmarkMutex-2 10000000       149 ns/op
BenchmarkConcurrentAtomicInt32-2 5000000       443 ns/op
BenchmarkConcurrentMutex-2 1000000      3501 ns/op
ok   tmp 9.859s
14:57:55 jnml@shinigami:~/src/tmp$ 

----
PS: op is 2 set/get pairs in 'one' and 'mone', more complex in 'conc'.

Patrick Mylund Nielsen

unread,
Feb 17, 2012, 9:11:33 AM2/17/12
to Jan Mercl, golan...@googlegroups.com
Yes, but since b.N is varied until the result is reliable, it's much,
much more than my initial million calls. If you add fmt.Println(b.N)
to the funcs, you see that it runs 2xR/W 50,000,000 times before
determining it's reliable.

So my first results were "correct", they just didn't run long enough.
I'm curious how big of a gap optimization makes between each, though,
and how to test something like this reliably.

minux

unread,
Feb 17, 2012, 9:17:07 AM2/17/12
to Jan Mercl, golan...@googlegroups.com
Sorry, I failed to notice that the test program actually aren't using b.N.

Patrick Mylund Nielsen

unread,
Feb 17, 2012, 9:22:23 AM2/17/12
to Jan Mercl, golan...@googlegroups.com
Also, note that the numbers are the speed of one iteration through the
for loop, not the time it takes to perform the entire 1,000,000 or
50,000,000 iterations. (My comment insinuating that each "op" was more
than one was erroneous.)

Vanja Pejovic

unread,
Feb 17, 2012, 10:34:40 AM2/17/12
to Bobby Powers, Patrick Mylund Nielsen, Henrik Johansson, golang-nuts, minux, Jake Brukhman
I agree, the mutex shouldn't be around the actual work of recording. If I wanted recording to record a precise set of values, I would code it like this:

var (
       rec bool
       mu = &sync.Mutex{}
)

func maybeRecord(v V) {
       mu.Lock()
       defer mu.Unlock()
       if rec {
              toRecord <- v
       }
}

func setRecording(s bool) {
       mu.Lock()
       defer mu.Unlock()
       rec = s
}

...
for {
      v := <-data
      rmaybeRecord(v)
      processData(v)
 }

So what happens now is:

> Context 1 grabs the mutex
> rec is true
> then you evaluate the if statement in maybeRecord(), and go into the block
> then switch to context 2 which goes into setRecording and blocks on the mutex without changing rec
> then switch back and send the value on a channel to a goroutine which does all the actual writes.

> context 1 drops the mutex
> context 2 wakes and acquires mutex, setting recording to false

The same works if you do the recording synchronously instead of sending the value on a channel.

Reply all
Reply to author
Forward
0 new messages