Elements vs Setters and Getters

1,008 views
Skip to first unread message

Paul Borman

unread,
Oct 4, 2011, 5:26:29 PM10/4/11
to golang-nuts
Suppose you have a structure that includes a status that is a string.  Naively I would simply have:

type Data struct {
    Status string
}

Clearly more would be going on, but I am focussing on the Status element.  Now I can go and say:

    d := &Data{}
    ...
    fmt.Printf("d's status: %s\n", d.Status)

But this is not safe at all if I might be changing status of d in another goroutine.  The Printf can crash my program.

So, I end up with something like:

type Data struct {
    mu sync.Mutex
    status string
}

func (d *Data)Status string {
    d.mu.Lock()
    defer d.mu.Unlock()
    return d.status
}

func (d *Data)SetStatus(s string) {
    d.mu.Lock()
    defer d.mu.Unlock()
    d.status = s
}

I need to do this for some types and not others.  For example, if Status was an int I could just set it and read it.  Is this really the recommended way of dealing with objects that might be being used in multiple goroutines?  It just seems like a lot of boiler plate code and an easy to make, hard to catch, pitfall for go programs.

    -Paul

Han-Wen Nienhuys

unread,
Oct 4, 2011, 5:42:51 PM10/4/11
to Paul Borman, golang-nuts
On Tue, Oct 4, 2011 at 6:26 PM, Paul Borman <bor...@google.com> wrote:
> Suppose you have a structure that includes a status that is a string.
>  Naively I would simply have:
> type Data struct {
>     Status string
> }

Assuming the status strings themselves don't change, you could have

..
Status *string
..

have that point to different constants. That said, if Status has a
finite number of possibilitises, it's probably easier and more
efficient to do

type Status int
const (
OK = Status(0)
ENOENT = Status(22)
)

func (s Status) String() string {
..
}

--
Han-Wen Nienhuys
Google Engineering Belo Horizonte
han...@google.com

Paul Borman

unread,
Oct 4, 2011, 5:46:40 PM10/4/11
to Han-Wen Nienhuys, golang-nuts
That would be a solution to a specific use case, but not a general solution.  I am afraid I can't get away from Setters and Getters if I want to have concurrency.  In the actual use case the status can be an arbitrary string and is read over a connection from another program so I cannot enumerate it.

    -Paul

tux21b

unread,
Oct 4, 2011, 5:51:10 PM10/4/11
to golan...@googlegroups.com
There shouldn't be that many objects which are used by multiple goroutines in your code. It's better, much faster and less error-prone to use those objects only in one goroutine at a time (isolated mutability). Also, the general advice is: "Don't communicate by sharing memory; share memory by communicating." (Effective Go).

Also, it isn´t safe to read/write ints concurrently either. Use at least atomic.Store() and atomic.Load() to ensure that those variables are (1) changed atomically, (2) are visible to others (i.e. affect the RAM and not just a register or one of the many cpu caches) and (3) to ensure that those statements are not reordered by the compiler or the processor. As you see, shared mutability is, even with such a basic data type like ints, never that easy and cheap... avoid it!

In a few cases, it still might be better to use such objects which are protected by a mutex. But maybe you can still to reduce the amount of shared data by introducing a State object or something like that. Then it might be enough to fetch the State object concurrently, to get a snapshot of all mutable attributes at a specific time. (immutable datastructures and COW algorithms might help there).

-christoph

Uriel

unread,
Oct 4, 2011, 6:08:51 PM10/4/11
to Paul Borman, golang-nuts
On Tue, Oct 4, 2011 at 11:26 PM, Paul Borman <bor...@google.com> wrote:
> Is this really the recommended way of dealing with objects that might be being used in multiple goroutines?

The recommended way is not to access objects from multiple goroutines.
That is what channels are for, passing around the 'ownership' of
objects.

"Don't communicate by sharing memory; share memory by communicating."

uriel

John Asmuth

unread,
Oct 4, 2011, 7:00:22 PM10/4/11
to golan...@googlegroups.com, Paul Borman
Just some more detail on what I think a few of these guys are getting at...



type Stuff struct {
DataSet chan<- string
DataGet <-chan string
}

func NewStuff() *Stuff {
s := new(Stuff)
set, get := make(chan string), make(chan string)
s.DataSet = set
s.DataGet = get
go func(set <-chan string, get chan<- string) {
var data string
for {
select {
case d := <- set:
data = d
case get <- data:
}
}
}(set, get)
return s
}

s.DataSet <- aNewValue
theValue <- s.DataGet

Russ Cox

unread,
Oct 4, 2011, 10:07:06 PM10/4/11
to Paul Borman, golang-nuts
On Tue, Oct 4, 2011 at 17:26, Paul Borman <bor...@google.com> wrote:
> For example, if Status was an int I could just set it and read it.

Only if you don't care what the value is. If you want the value
to be up to date you have to do some kind of synchronization
between goroutines, at least according to the memory model.

Of course, if you have a single-word field that is
independent of all the others, in the current Go implementations
it is okay to read and write it without a lock: you'll get a
relatively recent value, and your program won't crash.
However, the memory model doesn't guarantee either of
those important behaviors: it doesn't guarantee that the
compiler won't change to break one or both at some point in
the future. If either behavior is important to your program,
synchronize.

> Is [locked getters]


> really the recommended way of dealing with objects that might be being used
> in multiple goroutines?

If you must have such an object, then yes, you have to
synchronize. You can use mutexes if you want, or you
can use channels, like in John Asmuth's example.

> It just seems like a lot of boiler plate code and
> an easy to make, hard to catch, pitfall for go programs.

Saying it is boilerplate suggests that you want something
to generate it for you, as some other languages do.
One problem (of many) with automatically locked getters
and setters is that they provide a false sense of security.
Maybe in this case you really do have a field that is
unrelated to everything else in the data structure, so that
you can read it and write it independently of all other
memory. In that case, an auto-generated locking getter/setter
wrapper would be just perfect. However, as soon as you
have two or more fields that must be adjusted in tandem,
you need a lock around the whole update, or a lock around
the read of the entire set, in order to get a consistent snapshot.
That's harder to auto-generate.

Of course, one programmer's boilerplate is another
programmer's opportunity to write a function. If this kind of
atomic string field comes up over and over in your program,

type SyncString struct {
mu sync.RWMutex
val string
}

func (s *SyncString) Get() {
s.mu.RLock()
defer s.mu.RUnlock()
return s.val
}

func (s *SyncString) Set(x string) {
s.mu.Lock()
defer s.mu.Unlock()
s.val = x
}

Russ

Lars Pensjö

unread,
Oct 5, 2011, 5:17:08 AM10/5/11
to golan...@googlegroups.com
There is also the risk with this implementation that someone (human mistake) access the status field directly, not understanding the concurrent issues. This can lead to hard-to-find bugs. One way to solve that is to make a package of it, and making the status field private. That will enforce the use of getters/setters. Using interfaces may also be a way to do this.

A comment on getters/setters (which is not entirely relevant to the issue of concurrency), is that instead of having a get and set operation that just use the struct as a dumb container, it may be better to replace the getter with a function that does some work that depends on the state. Instead of reading out the state somewhere else, and do the work at that place. Of course, this depends on the design.

Paul Borman

unread,
Oct 5, 2011, 2:43:56 PM10/5/11
to golan...@googlegroups.com
In this particular case I just need to make sure that I don't read a corrupt string structure and crash.  It doesn't matter if I return the one just before or just after someone else is changing it.  I just need to not crash.  Of course, there is expectation that the value is not too stale.

Russ's suggestion of basically building thread safe strings (and maps and slices) is not a bad solution but has the cost of a mutex per element.  The common wisdom of "well, if you need to protect a map you probably need to protect something else" does suggest a lock for the collection of data rather than just the map but at a cost of a new method per element.

It perhaps is an unfair association of mine.  If the language is going to tell me it is going to handle all my memory allocation/freeing (i.e., garbage collection) then I start to feel short changed if it is going to claim concurrency but make me do the equivalent of alloc/free for concurrency (lock/unlock).  Not enough friction on my slope, I guess.

Lars Pensjö

unread,
Oct 5, 2011, 3:49:23 PM10/5/11
to golan...@googlegroups.com, Paul Borman


On Wednesday, 5 October 2011 00:08:51 UTC+2, Uriel K wrote:

The recommended way is not to access objects from multiple goroutines.
That is what channels are for, passing around the 'ownership' of
objects.

"Don't communicate by sharing memory; share memory by communicating."

It is a very nice way of doing things, but it doesn't work well if the number of readers are much bigger than the number of writers. They would be forced to wait, only allowing access one at a time.

It may also be a problem if performance is important. In a test I did (a little synthetic), I got the following times:

atomic.LoadUint32: 2.5 ns
Mutex RLock+RUnlock: 18 ns
Mutex Lock+Unlock: 38 ns
Write+read uint32 on buffered channel, same goroutine: 64 ns
 

Lars Pensjö

unread,
Oct 5, 2011, 3:59:38 PM10/5/11
to golan...@googlegroups.com
On Wednesday, 5 October 2011 20:43:56 UTC+2, Paul Borman wrote:
In this particular case I just need to make sure that I don't read a corrupt string structure and crash.  It doesn't matter if I return the one just before or just after someone else is changing it.  I just need to not crash.  Of course, there is expectation that the value is not too stale.

Russ's suggestion of basically building thread safe strings (and maps and slices) is not a bad solution but has the cost of a mutex per element.  The common wisdom of "well, if you need to protect a map you probably need to protect something else" does suggest a lock for the collection of data rather than just the map but at a cost of a new method per element.
 
There is an alternative to the suggestion of Han-Wen Nienhuys. Use a pointer to strings, and allocate a new string every time it changes. That way, you only update the pointer. It is a variant of the "Copy on Write" algorithm. That solution can be extended to bigger data sets than a string. If I understand correctly what Russ wrote above, the read and write of a pointer is an atomic operation currently (as it is a single word), but maybe you can't trust it to always remain that way. The draw back is that frequent updates will waste memory, forcing the garbage collection to work more.

If you don't want to trust reading and writing of pointers as atomic, you can use the new atomic functions that support read and write of pointers.

Ian Lance Taylor

unread,
Oct 5, 2011, 4:01:49 PM10/5/11
to Paul Borman, golan...@googlegroups.com
Paul Borman <bor...@google.com> writes:

> In this particular case I just need to make sure that I don't read a corrupt
> string structure and crash. It doesn't matter if I return the one just
> before or just after someone else is changing it. I just need to not crash.
> Of course, there is expectation that the value is not too stale.
>
> Russ's suggestion of basically building thread safe strings (and maps and
> slices) is not a bad solution but has the cost of a mutex per element. The
> common wisdom of "well, if you need to protect a map you probably need to
> protect something else" does suggest a lock for the collection of data
> rather than just the map but at a cost of a new method per element.
>
> It perhaps is an unfair association of mine. If the language is going to
> tell me it is going to handle all my memory allocation/freeing (i.e.,
> garbage collection) then I start to feel short changed if it is going to
> claim concurrency but make me do the equivalent of alloc/free for
> concurrency (lock/unlock). Not enough friction on my slope, I guess.

We understand how to use garbage collection to more or less handle all
memory allocation issues. As somebody else said in a different way, we
don't understand how to handle all concurrency issues. The reason we
don't understand them is that any comprehensible approach requires some
notion of what collection of changes must be handled atomically. If
there are several associated fields that must all be changed as a unit,
then it doesn't help to make each individual field change atomic.

Go takes a nonintrusive approach, in that you have to specify the
locking yourself. Because that is hard to do correctly, Go strongly
encourages having a single owner for each data item. If a data item has
a single owner, there are no concurrency issues. But sometimes that
doesn't work, and then you need to use locks.

Other languages take different approaches. There is no clearly correct
approach in this space, at least not yet.

Ian

Ian Lance Taylor

unread,
Oct 5, 2011, 4:52:04 PM10/5/11
to golan...@googlegroups.com, Paul Borman
Lars Pensjö <lars....@gmail.com> writes:

> On Wednesday, 5 October 2011 00:08:51 UTC+2, Uriel K wrote:
>>
>> "Don't communicate by sharing memory; share memory by communicating."
>>
> It is a very nice way of doing things, but it doesn't work well if the
> number of readers are much bigger than the number of writers. They would be
> forced to wait, only allowing access one at a time.

It really depends on how your data is structured and how your program is
structured. It's hard to discuss the idea abstractly without concrete
details.

E.g., when there are many readers and few writers, it is often the case
that an update need not be pushed atomically to all readers. When that
is the case, each reader can have a channel which it checks via select
when it is ready to use an updated value. Then writes could go through
a single goroutine which would then push out the updated value via
channels to all readers. This is just an example of an alternate
approach which will work in some cases.

Ian

Russ Cox

unread,
Oct 5, 2011, 5:31:09 PM10/5/11
to Paul Borman, golan...@googlegroups.com
We might be able to give you more specific
answers if you point to the actual code.

Russ

Reply all
Reply to author
Forward
0 new messages