Multiple-reader single-writer map access - is this lockless workaround safe?


sqweek E.

Sep 12, 2016, 11:04:39 AM
to golang-nuts
Ok so after getting away with using maps to share data between threads for a loooong time, I've finally seen the light and accepted that this is not safe.

In my case the threads in question are a painter thread and a mouse/keyboard event processing thread. They want to share data because obviously a mouse hitbox needs to match what is being drawn on the screen, unless you're going for a UX that drives your users insane :)

Anyway, as I said, it's not safe to concurrently read and write a map; eventually this will result in a panic. Yet I'm stuck with a codebase that does this all over the place. So despite hating the thought of blocking either thread, I did what any honest java-for-dayjob coder would do and resigned myself to hiding the maps behind a sync.RWMutex.

Of course things are never that simple; adding locks to code that was not designed with them in mind is a good way to introduce deadlocks (particularly of the recursive-read-lock-while-other-thread-is-acquiring-write-lock variety). I made another change to address recursive lock acquisition, realised I missed a couple of shared maps and suddenly locking code was strewn all over my program; I got to the point of acquiring a read-lock every time the mouse moved before giving up in disgust. I thought about using an actual concurrent map, but giving up the builtin map's ease of use and well-typedness didn't thrill me. After some thought I came up with an alternative approach...

TL;DR: My shared structures are *mostly* read-only, in the sense that read operations are much more common than writes. I decided I can afford to change the writes so that instead of updating the maps in place, they (a) take a copy of the current map, (b) update the copy, and (c) replace the reference to the original map (held in a struct field) with the updated copy.

I was pretty chuffed with this approach. It's clearly not the most efficient but it allowed me to solve the panic without risking deadlock. I'd be surprised if I'm the first person to come up with it, but I haven't seen it mentioned - the majority of advice seems to be "use an RWLock" or "use a concurrent map".
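A minimal sketch of the copy-on-write pattern described above (all names are hypothetical). Note that it assumes a single writer, since two concurrent writers could lose each other's updates between Load and Store, and the swap itself goes through sync/atomic's Value type, because a plain field assignment is a data race, which is exactly the question the rest of this thread digs into:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// hitboxes is a hypothetical shared structure: reads vastly outnumber writes.
var hitboxes atomic.Value // always holds a map[string]int

func init() {
	hitboxes.Store(map[string]int{})
}

// update implements the (a) copy, (b) update, (c) replace steps.
// Safe only with a single writer goroutine.
func update(key string, val int) {
	old := hitboxes.Load().(map[string]int)
	fresh := make(map[string]int, len(old)+1)
	for k, v := range old { // (a) copy the current map
		fresh[k] = v
	}
	fresh[key] = val      // (b) update the copy
	hitboxes.Store(fresh) // (c) replace the reference, with barrier semantics
}

// lookup is what the reader (e.g. the painter thread) would call.
func lookup(key string) (int, bool) {
	m := hitboxes.Load().(map[string]int)
	v, ok := m[key]
	return v, ok
}

func main() {
	update("button", 42)
	v, ok := lookup("button")
	fmt.Println(v, ok) // prints "42 true"
}
```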

I'm interested in understanding how safe/portable the approach is. I'm no expert on ARM/memory barriers but my understanding is that without an explicit sync/channel operation there's no guarantee another thread will see a change (in this case, the new map instance). In my case I have a channel based notification mechanism running alongside to wake up the painter thread when something changes, I'm just not sure what memory changes that is guaranteed to communicate.

I also just wanted to share the approach and see how other people feel about it. It reminds me a bit of double-buffering - writing to the backbuffer and then swapping it into action.

-sqweek

Peter Bourgon

Sep 12, 2016, 11:21:05 AM
to sqweek E., golang-nuts
How are you replacing the reference to the original map? If you're
just using plain assignment, e.g. x.myMap = newMap, then that's not
safe. You could use sync/atomic.Value if you always make sure to do
map reads via the Load method.

pierre...@gmail.com

Sep 12, 2016, 11:59:39 AM
to golang-nuts
You might be interested in this:

sqweek E.

Sep 12, 2016, 12:04:30 PM
to golang-nuts, sqw...@gmail.com, pe...@bourgon.org
Yes, through plain assignment. What problems arise from that?

Caleb Spare

Sep 12, 2016, 12:07:28 PM
to sqweek E., golang-nuts, pe...@bourgon.org
That's still a data race. Are you using the race detector?

Peter Bourgon

Sep 12, 2016, 12:24:09 PM
to sqweek E., golang-nuts
All memory operations are unsafe for concurrent goroutines unless
explicitly noted otherwise. In practice, if you have multiple
goroutines accessing the same memory, you need to protect it with the
primitives available to you in package sync or sync/atomic. Please
review https://golang.org/ref/mem for more details.

Many Go programmers opt to sidestep these sorts of problems by using
channels to orchestrate behavior and/or the flow of data between
goroutines — i.e. not communicating by sharing memory, but rather
sharing memory by communicating.
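A sketch of that channel-oriented style (all names hypothetical): a single owner goroutine confines the map, and every other goroutine reads and writes only through channels, so the map is never accessed concurrently and no locks are needed.

```go
package main

import "fmt"

// getReq is a read request: the owner replies on the embedded channel.
type getReq struct {
	key   int
	reply chan int
}

// owner confines the map to one goroutine and serializes all access.
func owner(gets chan getReq, sets chan [2]int, done chan struct{}) {
	m := make(map[int]int) // never escapes this goroutine
	for {
		select {
		case s := <-sets:
			m[s[0]] = s[1]
		case g := <-gets:
			g.reply <- m[g.key]
		case <-done:
			return
		}
	}
}

func main() {
	gets := make(chan getReq)
	sets := make(chan [2]int)
	done := make(chan struct{})
	go owner(gets, sets, done)

	sets <- [2]int{9, 3} // write
	r := make(chan int)
	gets <- getReq{key: 9, reply: r} // read
	fmt.Println(<-r)                 // prints "3"
	close(done)
}
```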

adon...@google.com

Sep 12, 2016, 5:28:37 PM
to golang-nuts, sqw...@gmail.com, pe...@bourgon.org
On Monday, 12 September 2016 12:04:30 UTC-4, sqweek E. wrote:
Yes, through plain assignment. What problems arise from that?

Here's the general form of the problem.  In the code below, one goroutine (f) creates a variable, initializes it (x=1), then shares its address with another goroutine (in this case by assigning to a global variable, but that's a detail).  Another goroutine (g) reads the pointer out of the global variable and inspects the variable to which it points.

type T struct { x int }

var global *T

func f() {
    p := new(T)
    p.x = 1
    global = p // "publish" the new T (racy!)
}

func g() {
    p := global
    if p != nil {
        println(p.x) // may print "0" due to data race
    }
}

func main() {
    go f()
    go g()
}

One might naively think that the print statement always prints 1, but in fact it can print 0.  The reason is that the compiler or CPU is free to reorder the assignments p.x=1 and global=p, since f can't tell.  But reordering means other goroutines might observe a non-nil pointer in global before the thing it points to has been fully initialized.

To safely "publish" a variable initialized by one goroutine so that other goroutines see it properly initialized, you need "barrier semantics", that is, you need to tell the compiler and CPU not to reorder those assignments. That's what atomic.Value does.
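To make that concrete, here is one way to repair the racy publish in the snippet above using atomic.Value (a sketch following the same structure; the channel is only there so main waits for the writer):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

type T struct{ x int }

var global atomic.Value // holds a *T

// publish initializes a T and stores it. Store is a store-release,
// so the write p.x = 1 is visible to any goroutine whose Load observes p.
func publish() {
	p := new(T)
	p.x = 1
	global.Store(p)
}

func main() {
	done := make(chan struct{})
	go func() {
		publish()
		close(done)
	}()
	<-done
	p := global.Load().(*T) // load-acquire: p is fully initialized
	fmt.Println(p.x)        // prints "1", never "0"
}
```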

But you shouldn't reach for atomic.Value without data.  In most programs, a mutex is simpler to reason about and plenty fast enough.
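For comparison, a sketch of the mutex version (hypothetical names); a sync.RWMutex lets concurrent readers proceed while still serializing them against writers:

```go
package main

import (
	"fmt"
	"sync"
)

// safeMap guards a plain map with a reader/writer mutex.
type safeMap struct {
	mu sync.RWMutex
	m  map[int]int
}

func (s *safeMap) get(k int) (int, bool) {
	s.mu.RLock() // many readers may hold RLock at once
	defer s.mu.RUnlock()
	v, ok := s.m[k]
	return v, ok
}

func (s *safeMap) set(k, v int) {
	s.mu.Lock() // writers get exclusive access
	defer s.mu.Unlock()
	s.m[k] = v
}

func main() {
	s := &safeMap{m: make(map[int]int)}
	s.set(4, 2)
	v, ok := s.get(4)
	fmt.Println(v, ok) // prints "2 true"
}
```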

sqweek E.

Sep 12, 2016, 9:41:43 PM
to golang-nuts, sqw...@gmail.com, pe...@bourgon.org
Thanks all. "Plain assignment" is not quite the full story actually - as I mentioned originally there's also a channel in use to wake up the painter thread. So in terms of memory model what I'm doing is more like:

    x.myMap = newMap
    x.refresh <- nil // i don't think it matters what value is sent across the channel?

Also I don't mind a bit of racy non-determinism. If the painter thread happens to be actively painting when I do this update I don't care if it notices the newMap immediately or not, just as long as it sees the update when painting the next frame. That will be after it has drained the refresh channel, so should be guaranteed by the memory model if I've understood correctly.

For the general case atomic.Value sounds appropriate. It means touching all the reader code as well, but in a fairly trivial way.

Thanks again!
-sqweek

Caleb Spare

Sep 12, 2016, 9:48:06 PM
to sqweek E., golang-nuts
If you have a data race, you lose a lot of guarantees about the
correct operation of your program. It may work today but fail
occasionally, or a future version of the compiler may break it very
badly.

See https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong.

Most Go programmers consider programs with data races to be logically
incorrect / broken.

-Caleb

Paul Borman

Sep 13, 2016, 10:34:12 AM
to Caleb Spare, sqweek E., golang-nuts
I think it is pretty common to make your map/list "read only" and to update it by a replace.  That does not get you away from the need for a mutex, however.   There is still a reader and writer:

Reader:
v := m[key]

Writer:
m = newMap

Those two operations are not safe together.  You still need a mutex:

mu.Lock()
mc := m
mu.Unlock()
v := mc[key]

mu.Lock()
m = newMap
mu.Unlock()

Of course, if you are not going to fetch multiple values from mc, it degenerates to:

mu.Lock()
v := m[key]
mu.Unlock()

Which is what you are trying to avoid.

That said, a map is represented by a single machine word (32 or 64 bits).  I don't know of any modern architecture where reading a cache-aligned word while it is being written will yield anything but the old or new value.  The Go memory model does not guarantee this, however, so while it might work for you now on your current target, that is a side effect of an implementation and architecture detail.

There are some architectures that allow non-aligned reads/writes where a machine word could span cache lines, but that is very expensive and I do not believe Go ever generates code like that, though I guess it could be possible.

    -Paul




Alan Donovan

Sep 13, 2016, 10:57:47 AM
to Paul Borman, Caleb Spare, sqweek E., golang-nuts
On 13 September 2016 at 10:33, 'Paul Borman' via golang-nuts <golan...@googlegroups.com> wrote:
That said, a map is represented by a single machine word (32 or 64 bits).  I don't know of any modern architecture where reading a cache aligned word while it is being written will yield anything but the old or new value.

This line of reasoning is what leads to trouble.  The atomicity problem is not with the map reference, which will be either old or new, but with all the memory writes to the actual hash table, which may be in any number of intermediate states including "impossible" ones.  You must ensure that the completion of those writes happens before another goroutine attempts to read from the hashtable, and for that, you need some kind of barrier: a channel, mutex, or atomic.Value.

Paul Borman

Sep 13, 2016, 11:31:09 AM
to Alan Donovan, Caleb Spare, sqweek E., golang-nuts
Yes, that is very true, though in the OP's case, the memory barrier can easily be done by the writer without affecting the reader (simply by switching the write to the channel with the assignment of x.myMap).  Granted, the reader might have stale information, but I believe that the compiler currently will only cache the pointer to the map with relative locality, so for practical (but not guaranteed) purposes, it will not appear to be a problem.

But I will reiterate what I said:  The Go memory model does not guarantee this, however, so while it might work for you now on your current target, that is a side effect of an implementation and architecture detail.


sqweek E.

Sep 13, 2016, 12:36:54 PM
to golang-nuts, sqw...@gmail.com
On Tuesday, September 13, 2016 at 9:48:06 AM UTC+8, Caleb Spare wrote:
See https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong.

I've read this article before and the second reading hasn't done much to improve my impression. The premise in the first example, that multiple threads updating the same memory location in a non-atomic fashion constitutes "innocent" code, is about as strawman as they come. Forget the data race, there's an obvious race condition here! And unless C++ atomic operations are wildly different from those I have experience with (java mainly), the suggested "correct" code fixes only the data race and not the race condition.

The remaining examples sound indistinguishable from compiler bugs. You have multiple threads concurrently accessing a 'stop' variable, and yet the compiler believes that "the stop variable can't be accessed concurrently by other threads"? Might want to consider some escape analysis before making that assumption...

You define a function-local variable intended to hold a copy of some shared memory, and the compiler decides to read the shared memory twice? Thanks buddy!

The last one takes the cake - you read from a memory location and the compiler TURNS IT INTO A WRITE??? And problems emerging from all of these get blamed on the programmer's "racy code" instead of a broken compiler?

W. T. F.
 
Most Go programmers consider programs with data races to be logically
incorrect / broken.

Maybe there are good reasons to consider them such, but this article has not convinced me of any.
-sqweek

Alan Donovan

Sep 13, 2016, 12:47:29 PM
to sqweek E., golang-nuts
On 13 September 2016 at 12:36, sqweek E. <sqw...@gmail.com> wrote:
On Tuesday, September 13, 2016 at 9:48:06 AM UTC+8, Caleb Spare wrote:
See https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong.

I've read this article before and the second reading hasn't done much to improve my impression. The premise in the first example, that multiple threads updating the same memory location in a non-atomic fashion constitutes "innocent" code, is about as strawman as they come. Forget the data race, there's an obvious race condition here! And unless C++ atomic operations are wildly different from those I have experience with (java mainly), the suggested "correct" code fixes only the data race and not the race condition.

I don't think it's a straw man.  Many novice programmers wrongly assume that the behavior of a concurrent program can be understood as an arbitrary interleaving of the statements of the threads of that program, and this is almost true for a few languages, such as Python.  The discovery that it cannot may be profoundly surprising.


And problems emerging from all of these get blamed on the programmer's "racy code" instead of a broken compiler?

The authors of the Go spec tend to take a more user-friendly line than the C standard.  For example, the Go memory model says that the value observed by a racy read of a single-word variable is not arbitrary, but must equal one of the values previously or concurrently stored there. In contrast, C doesn't define this behavior at all.  Whether C-based compilers such as gccgo or llgo actually implement this behavior is an interesting question.

sqweek E.

Sep 13, 2016, 1:07:56 PM
to golang-nuts, bor...@google.com, ces...@gmail.com, sqw...@gmail.com, adon...@google.com
Thank you for spelling this out! To put it into concrete terms, the hash table is likely spread over multiple cache lines, so if a goroutine executing on a processor other than the one which built the hash table were to load the cache line containing the new map reference and access it before loading the actual hash table cache lines, that would spell disaster.

I still kind of feel like a memory barrier after updating the map reference will be sufficient in practice; it doesn't seem likely that another processor would see the new map reference before the memory barrier caused the rest of the hash table to also be visible? But I'm no x86 expert, let alone ARM or less common architectures, so the boat is certainly starting to leak.

Lesson learned; use barriers if sharing memory.
-sqweek

Henrik Johansson

Sep 13, 2016, 2:58:24 PM
to sqweek E., golang-nuts, bor...@google.com, ces...@gmail.com, adon...@google.com

I think this will be an interesting read especially if you come from Java.

https://shipilev.net/blog/2016/close-encounters-of-jmm-kind/



Ian Lance Taylor

Sep 13, 2016, 3:07:08 PM
to sqweek E., golang-nuts
On Tue, Sep 13, 2016 at 9:36 AM, sqweek E. <sqw...@gmail.com> wrote:
>
> The remaining examples sound indistinguishable from compiler bugs.

Developers of C/C++ compilers rigorously rely on the language
standard's definition of undefined behavior when optimizing programs.
This can lead to many behaviors that, when first encountered, are
often denounced as compiler bugs. But the compiler's only promise is
to implement the language standard.

For example, until C++11 clarified that speculative stores were
forbidden, C/C++ compilers really did generate them, and it really did
baffle and confuse people and break programs including the Linux
kernel, but it was not wrong according to the then-relevant language
standards. There are many similar cases. It's easier to say that the
compiler is broken than it is to point to the rules that explain why
that is the case.

Ian

sqweek E.

Sep 15, 2016, 11:47:03 AM
to golang-nuts, sqw...@gmail.com, bor...@google.com, ces...@gmail.com, adon...@google.com
On Wednesday, September 14, 2016 at 2:58:24 AM UTC+8, Henrik Johansson wrote:

I think this will be an interesting read especially if you come from Java.

https://shipilev.net/blog/2016/close-encounters-of-jmm-kind/


There were some surprises in there! With a bit more perspective I can see that my criticisms of the other data race article in this thread were grounded in an "ideal" reality, where the compiler exists solely to provide a good user experience to programmers. eg. via abstractions that don't leak details about the underlying machine's memory semantics, or verifying that an optimisation's prerequisites actually hold before applying it.

Of course the reality we inhabit includes compilers which are blind to data races, since a data race is defined by their language spec as causing undefined behaviour. It's mildly unfortunate that a program containing undefined behaviour is allowed to compile (tip of the hat to rust for going the extra mile here), but with C inhabiting a space intentionally close to the machine and having a long history prior to threads becoming widely used it is understandable.

Thanks everyone for helping me learn! If you'll humour me a moment longer, I'm curious about the atomic.Value approach. If you have a single writer:

newMap := make(map[int]int)
newMap[4] = 2
newMap[9] = 3
x.myMap.Store(newMap) // x.myMap is an atomic.Value

And a reader:

table := x.myMap.Load().(map[int]int)
fmt.Println(table[9])

Does go's memory model ensure that the read of table[9] sees the value the writer put there (as long as the Load happens after the Store)?

If so, how does the compiler/runtime know that table[9] is a shared memory location requiring proper publication when it hasn't been explicitly involved in any kind of barrier operation?

If the writer/reader use a channel instead of atomic.Value, does it change anything? ie.

x.myMap = newMap // x.myMap is a map[int]int
x.changed <- nil

<-x.changed
table := x.myMap
fmt.Println(table[9])

I guess in this case the scheduler could arrange for the reader to run on the same processor as the writer did and avoid worrying about cache inconsistencies? But I also see an opportunity here for the writer to update the map again, after the reader has received on the channel but before it has read from x.myMap. Which means the read could see the second update to the map whose underlying hashtable may not be in a consistent state when observed by the reader?

(the sensible thing would be to send the map over the channel, but for the purpose of thought experiment...)
-sqweek

Ian Lance Taylor

Sep 15, 2016, 12:13:59 PM
to sqweek E., golang-nuts, Paul Borman, Caleb Spare, Alan Donovan
On Thu, Sep 15, 2016 at 8:47 AM, sqweek E. <sqw...@gmail.com> wrote:
>
> Thanks everyone for helping me learn! If you'll humour me a moment longer,
> I'm curious about the atomic.Value approach. If you have a single writer:
>
> newMap := make(map[int]int)
> newMap[4] = 2
> newMap[9] = 3
> x.myMap.Store(newMap) //x.myMap is an atomic.Value
>
> And a reader:
>
> table := x.myMap.Load().(map[int]int)
> fmt.Println(table[9])
>
> Does go's memory model ensure that the read of table[9] sees the value the
> writer put there (as long as the Load happens after the Store)?

Yes. (Well, Go's memory model doesn't actually discuss
sync/atomic.Value at all. But, if it did, it would make this
guarantee.)

> If so, how does the compiler/runtime know that table[9] is a shared memory
> location requiring properly publishing when it hasn't been explicitly
> involved in any kind of barrier operation?

Storing a value in a atomic.Value is a store-release operation, and
loading a value from an atomic.Value is a load-acquire operation.
That is, moving a value through atomic.Value ensures that all memory
writes done by the writing goroutine preceding the atomic.Value are
visible to the reading goroutine after reading from the atomic.Value.


> If the writer/reader use a channel instead of atomic.Value, does it change
> anything? ie.
>
> x.myMap = newMap //x.myMap is a map[int]int
> x.changed <- nil
>
> <-x.changed
> table := x.myMap
> fmt.Println(table[9])
>
> I guess in this case the scheduler could arrange for the reader to run on
> the same processor as the writer did and avoid worrying about cache
> inconsistencies? But I also see an opportunity here for the writer to update
> the map again, after the reader has received on the channel but before it
> has read from x.myMap. Which means the read could see the second update to
> the map whose underlying hashtable may not be in a consistent state when
> observed by the reader?

It's the same answer: writing a value to a channel is a store-release,
and reading a value is a load-acquire. Actually, channels might
require an even stronger guarantee of sequential consistency. I'm not
sure off hand.
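Concretely, the happens-before edge the channel pair creates can be sketched like this (single writer, single reader; names hypothetical). Every write made before the send is guaranteed visible after the corresponding receive:

```go
package main

import "fmt"

func main() {
	var myMap map[int]int
	changed := make(chan struct{})

	go func() { // writer
		m := map[int]int{9: 3}
		myMap = m // this write happens before the send...
		changed <- struct{}{}
	}()

	<-changed // ...and the send happens before this receive completes
	fmt.Println(myMap[9]) // guaranteed to print "3"
}
```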

Ian

刘桂祥

Nov 3, 2016, 11:54:46 PM
to golang-nuts, sqw...@gmail.com, pe...@bourgon.org, adon...@google.com
Sorry, could you provide a complete example?

I tried this but could not reproduce the problem:
package main

import "time"

type T struct{ x int }

var global *T

func f() {
    p := new(T)
    p.x = 1
    global = p // "publish" the new T (racy!)
}

func g() {
    p := global
    if p != nil {
        // println(p.x) // may print "0" due to data race
        if p.x == 0 {
            panic("ca")
        }
    }
}

func main() {
    for i := 0; i < 10000; i++ {
        go func() {
            for j := 0; j < 1000; j++ {
                f()
            }
        }()
        go func() {
            for j := 0; j < 1000; j++ {
                g()
            }
        }()
    }
    time.Sleep(100 * time.Second)
}


On Tuesday, 13 September 2016 at 05:28:37 UTC+8, adon...@google.com wrote:

adon...@google.com

Nov 4, 2016, 9:14:01 AM
to golang-nuts, sqw...@gmail.com, pe...@bourgon.org, adon...@google.com
I think you are saying you failed to observe a data race during execution of this program.  It is not safe to conclude that a program does not have a data race just because you didn't notice it.  The behavior of a program containing a data race depends on many things, including the set of optimizations done by a compiler, and the architecture and microarchitecture of the processor.

If you build this program with the -race flag, the dynamic race detector will almost certainly detect and report a problem.