maps and thread safety


Lars Pensjö

Oct 1, 2011, 6:47:55 AM
to golan...@googlegroups.com
I know that maps are not thread safe. But how unsafe are they? Can a map get into an illegal state where it can't be used?

Even worse, can the memory allocation get into illegal state? Can a pointer be partly updated, leading to illegal pointers?

This is also a general question about much of the language functionality and the runtime library. For a runtime library, I expect that a package can be broken (put into an illegal state) by parallel threads, unless explicitly stated otherwise. The reason I am asking is that you can sometimes skip the use of semaphores for an object if the algorithm can handle random asynchronous changes of that object. But that would only be possible if the basic programming language mechanisms at least keep internal "state safety".

There is the sync/atomic package, which seems to indicate that almost anything can indeed be broken from parallel threads.

Dmitry Vyukov

Oct 1, 2011, 6:54:13 AM
to golan...@googlegroups.com
On Sat, Oct 1, 2011 at 2:47 PM, Lars Pensjö <lars....@gmail.com> wrote:
I know that maps are not thread safe. But how unsafe are they?

sufficiently
 
Can a map get into an illegal state where it can't be used?

yes 

Even worse, can the memory allocation get into illegal state?

yes
 
Can a pointer be partly updated, leading to illegal pointers?

yes
 
This is also a general question on many of the language functionality and runtime library. For a runtime library, I expect that a package can be broken (getting an illegal state) from parallel threads, unless explicitly stated otherwise. The reason I am asking is that you can sometimes skip the use of semaphores for an object if the algorithm can handle random asynchronous changes of that object.

Such objects are called thread-safe; map, slice and string are not.
 
But that would only be possible if the basic programming language mechanisms at least keep internal "state safety".

There is the sync/atomic package, which seems to indicate that almost anything can indeed be broken from parallel threads.

yes

⚛

Oct 1, 2011, 7:38:06 AM
to golang-nuts
On Oct 1, 12:54 pm, Dmitry Vyukov <dvyu...@google.com> wrote:
> On Sat, Oct 1, 2011 at 2:47 PM, Lars Pensjö <lars.pen...@gmail.com> wrote:
> > This is also a general question on many of the language functionality and
> > runtime library. For a runtime library, I expect that a package can be
> > broken (getting an illegal state) from parallel threads, unless explicitly
> > stated otherwise. The reason I am asking is that you can sometimes skip the
> > use of semaphores for an object if the algorithm can handle random
> > asynchronous changes of that object.
>
> Such objects called thread-safe, map/slice/string are not.

Go strings are fully thread-safe (they are immutable).

Slices are thread-safe as long as they refer to non-overlapping memory
regions, even when multiple slices are backed by the same array.
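
A minimal sketch of the non-overlapping case, assuming two goroutines that each write only to their own half of one backing array. The function name fillHalves is hypothetical, and the sync.WaitGroup is there only so that the result is read after both writers have finished:

```go
package main

import (
	"fmt"
	"sync"
)

// fillHalves writes to two non-overlapping halves of one backing array
// from two goroutines. No element is touched by both goroutines.
func fillHalves(n int) []int {
	buf := make([]int, 2*n)
	lo, hi := buf[:n], buf[n:] // two slices, same array, disjoint ranges

	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		for i := range lo {
			lo[i] = 1
		}
	}()
	go func() {
		defer wg.Done()
		for i := range hi {
			hi[i] = 2
		}
	}()
	wg.Wait() // read the result only after both writers are done
	return buf
}

func main() {
	fmt.Println(fillHalves(3)) // [1 1 1 2 2 2]
}
```

What is not covered here is reassigning the slice variables themselves (lo = ..., hi = ...) from several goroutines; that is a write to the variable, not to the elements.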

Map is the most problematic case. They are (most likely, somebody
please confirm this) thread-safe as long as you are not adding
anything to the map: concurrently overwriting already existing values
is probably safe as long as the keys are different (this is similar to
a slice). So this should work:

m := make(map[string]int)
m["a"] = 1
m["b"] = 2
go func() { m["a"] = 3 }()
go func() { m["b"] = 4 }()
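
If that guess turns out to be wrong, a conservative alternative is to guard every map access with a sync.Mutex. This is only a sketch: guardedMap and its set/get methods are hypothetical names made up for illustration, not part of any library.

```go
package main

import (
	"fmt"
	"sync"
)

// guardedMap is a hypothetical wrapper: a map plus the mutex protecting it.
type guardedMap struct {
	mu sync.Mutex
	m  map[string]int
}

// set stores v under k while holding the lock.
func (g *guardedMap) set(k string, v int) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.m[k] = v
}

// get reads the value for k while holding the lock.
func (g *guardedMap) get(k string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.m[k]
}

// run overwrites two existing keys from two goroutines and returns the results.
func run() (int, int) {
	g := &guardedMap{m: map[string]int{"a": 1, "b": 2}}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); g.set("a", 3) }()
	go func() { defer wg.Done(); g.set("b", 4) }()
	wg.Wait()
	return g.get("a"), g.get("b")
}

func main() {
	a, b := run()
	fmt.Println(a, b) // 3 4
}
```

The lock costs something on every access, but the program no longer depends on how the map happens to be implemented.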

Oct 1, 2011, 7:47:39 AM
to golang-nuts
On Oct 1, 12:47 pm, Lars Pensjö <lars.pen...@gmail.com> wrote:
> Even worse, can the memory allocation get into illegal state? Can a pointer
> be partly updated, leading to illegal pointers?

Do you mean a pointer such as "*int"?

Dmitry Vyukov

Oct 1, 2011, 7:51:35 AM
to ⚛, golang-nuts
On Sat, Oct 1, 2011 at 3:38 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
On Oct 1, 12:54 pm, Dmitry Vyukov <dvyu...@google.com> wrote:
> On Sat, Oct 1, 2011 at 2:47 PM, Lars Pensjö <lars.pen...@gmail.com> wrote:
> > This is also a general question on many of the language functionality and
> > runtime library. For a runtime library, I expect that a package can be
> > broken (getting an illegal state) from parallel threads, unless explicitly
> > stated otherwise. The reason I am asking is that you can sometimes skip the
> > use of semaphores for an object if the algorithm can handle random
> > asynchronous changes of that object.
>
> Such objects called thread-safe, map/slice/string are not.

Go strings are fully thread-safe (they are immutable).

Of course they are not. Neither thread-safe nor immutable.
Never ever do this:

type Person struct {
  name string
  ...
}

go func() { person.name = newName }()
go func() { fmt.Printf("%s", person.name) }()

It breaks badly.
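
For contrast, a minimal sketch of a version that does not race, assuming the field is only ever touched through accessors that take a lock. The mu field and the SetName/Name methods are hypothetical additions for illustration; the elided fields of the original struct are left out.

```go
package main

import (
	"fmt"
	"sync"
)

// Person guards its name with a read/write lock (hypothetical design).
type Person struct {
	mu   sync.RWMutex
	name string
}

// SetName replaces the name under the write lock.
func (p *Person) SetName(n string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.name = n
}

// Name returns a copy of the current name under the read lock.
func (p *Person) Name() string {
	p.mu.RLock()
	defer p.mu.RUnlock()
	return p.name
}

func main() {
	p := &Person{name: "old"}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); p.SetName("new") }()
	go func() { defer wg.Done(); fmt.Printf("%s\n", p.Name()) }() // "old" or "new"
	wg.Wait()
}
```

The reader still cannot know which of the two names it will see, but it always sees a whole one.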




Slices are thread-safe as long as they refer to non-overlapping memory
regions, even when multiple slices are backed by the same array.

The same as strings.

 

Map is the most problematic case. They are (most likely, somebody
please confirm this) thread-safe as long as you are not adding
anything to the map: concurrently overwriting already existing values
is probably safe as long as the keys are different (this is similar to
a slice). So this should work:

m := make(map[string]int)
m["a"] = 1
m["b"] = 2
go func() { m["a"] = 3 }()
go func() { m["b"] = 4 }()

Just don't run hg update ;)

Lars Pensjö

Oct 1, 2011, 8:24:57 AM
to golan...@googlegroups.com
Thanks Dmitry for the information!


On Saturday, 1 October 2011 12:54:13 UTC+2, Dmitry Vyukov wrote:
On Sat, Oct 1, 2011 at 2:47 PM, Lars Pensjö <lars....@gmail.com> wrote:
[Can parallel threads break data?]
yes 
[...]
yes
[...]
yes
[...]
yes

Hm, I think I get the point...

Using the sync/atomic package, you can do some kind of "atomic" read and write. But what are the rules? That is, if you do atomic.StoreInt32(), are you then required to do atomic.LoadInt32() to be guaranteed consistency?

Are the atomic operations implemented using semaphores (implying possible performance issues), or using CPU instructions that guarantee the atomic behavior? Maybe you are not supposed to rely on either, as it may be implementation-specific.

But channels, they are thread-safe, aren't they?

The specification of Go doesn't mention the word "thread-safe", which could be interpreted as nothing is thread-safe (not even channels). Maybe a section would be needed to clearly state what to expect (or there already is, but I missed it).

⚛

Oct 1, 2011, 8:35:48 AM
to golang-nuts
On Oct 1, 1:51 pm, Dmitry Vyukov <dvyu...@google.com> wrote:
> On Sat, Oct 1, 2011 at 3:38 PM, ⚛ <0xe2.0x9a.0...@gmail.com> wrote:
> > On Oct 1, 12:54 pm, Dmitry Vyukov <dvyu...@google.com> wrote:
> > > On Sat, Oct 1, 2011 at 2:47 PM, Lars Pensjö <lars.pen...@gmail.com>
> > wrote:
> > > > This is also a general question on many of the language functionality
> > and
> > > > runtime library. For a runtime library, I expect that a package can be
> > > > broken (getting an illegal state) from parallel threads, unless
> > explicitly
> > > > stated otherwise. The reason I am asking is that you can sometimes skip
> > the
> > > > use of semaphores for an object if the algorithm can handle random
> > > > asynchronous changes of that object.
>
> > > Such objects called thread-safe, map/slice/string are not.
>
> > Go strings are fully thread-safe (they are immutable).
>
> Of course they are not. Neither thread-safe nor immutable.
> Never ever do this:
>
> type Person struct {
>   name string
>   ...
>
> }
>
> go func() { person.name = newName }()
> go func() { fmt.Printf("%s", person.name) }()
>
> It breaks badly.

Also incorrect (?):

type S struct { a byte }
var s S
go func() { s.a = 10 }()
go func() { println(s.a) }()

... since the byte might be read/written by accessing individual bits.

> > Slices are thread-safe as long as they refer to non-overlapping memory
> > regions, even when multiple slices are backed by the same array.
>
> The same as strings.

The same as bytes?

> > Map is the most problematic case. They are (most likely, somebody
> > please confirm this) thread-safe as long as you are not adding
> > anything to the map: concurrently overwriting already existing values
> > is probably safe as long as the keys are different (this is similar to
> > a slice). So this should work:
>
> > m := make(map[string]int)
> > m["a"] = 1
> > m["b"] = 2
> > go func() { m["a"] = 3 }()
> > go func() { m["b"] = 4 }()
>
> Just don't run hg update ;)

I run "hg update", but I didn't notice anything map-related.

⚛

Oct 1, 2011, 8:43:13 AM
to golang-nuts
On Oct 1, 2:24 pm, Lars Pensjö <lars.pen...@gmail.com> wrote:
> But channels, they are thread-safe, aren't they?

All operations on channels are, obviously, thread-safe.

Assignments are not thread-safe.

Assigning a channel to another variable is *not* an operation on the
channel. It is an operation on the variable. Thus, I disagree with
Dmitry's interpretation.
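
A minimal sketch of the distinction, assuming several goroutines that send on one shared channel. The sends and receives are operations on the channel; the variable ch is assigned once and never reassigned while the goroutines run. The function name sum is hypothetical.

```go
package main

import "fmt"

// sum starts n senders that share one channel and adds up what they send.
func sum(n int) int {
	ch := make(chan int) // the variable is assigned once, before any goroutine starts
	for i := 1; i <= n; i++ {
		go func(v int) { ch <- v }(i) // concurrent sends on the same channel
	}
	total := 0
	for i := 0; i < n; i++ {
		total += <-ch // receive exactly n values
	}
	return total
}

func main() {
	fmt.Println(sum(4)) // 10
}
```

Reassigning ch from one goroutine while another reads the variable would be a different matter: that is an unsynchronized write to a variable, the case called out above.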

> The specification of Go doesn't mention the word "thread-safe", which could
> be interpreted as nothing is thread-safe (not even channels).

I think we should first clarify what kind of operations we are talking
about.

Jesse McNelis

Oct 1, 2011, 8:50:43 AM
to golan...@googlegroups.com
On 01/10/11 22:35, ⚛ wrote:
> Also incorrect (?):
>
> type S struct { a byte }
> var s S
> go func() { s.a = 10 }()
> go func() { println(s.a) }()
>
> ... since the byte might be read/written by accessing individual bits.

huh? The memory model says that anything larger than a machine word
can't be atomically written or read. So writing to a byte can be atomic.

Writing to a string isn't, since a string is more than a single machine
word (length+pointer).

>>> Slices are thread-safe as long as they refer to non-overlapping memory
>>> regions, even when multiple slices are backed by the same array.
>>
>> The same as strings.
>
> The same as bytes?

Slices are the same as strings: length+pointer is too big to be atomically
written.

>> Just don't run hg update ;)
>
> I run "hg update", but I didn't notice anything map-related.

Some things in the current implementation work but aren't guaranteed to
work. As long as you never update to a newer version you'll never have
to deal with these kinds of changes.


- jessta

Dmitry Vyukov

Oct 1, 2011, 8:57:30 AM
to golan...@googlegroups.com
On Sat, Oct 1, 2011 at 4:24 PM, Lars Pensjö <lars....@gmail.com> wrote:
Thanks Dmitry for the information!

On Saturday, 1 October 2011 12:54:13 UTC+2, Dmitry Vyukov wrote:
On Sat, Oct 1, 2011 at 2:47 PM, Lars Pensjö <lars....@gmail.com> wrote:
[Can parallel threads break data?]
yes 
[...]
yes
[...]
yes
[...]
yes

Hm, I think I get the point...

Using the sync/atomic package, you can do some kind of "atomic" read and write. But what are the rules? That is, if you do atomic.StoreInt32(), are you then required to do atomic.LoadInt32() to be guaranteed consistency?

Yes.
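
A minimal sketch of that pairing, assuming a package-level int32 that one goroutine stores to and another loads from. The variable flag and the function setAndRead are hypothetical names; atomic.StoreInt32 and atomic.LoadInt32 are the real sync/atomic functions.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// flag is only ever accessed through sync/atomic.
var flag int32

// setAndRead stores 10 from another goroutine and returns the final value.
func setAndRead() int32 {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		atomic.StoreInt32(&flag, 10) // atomic write
	}()
	_ = atomic.LoadInt32(&flag) // concurrent atomic read: sees the old or the new value
	wg.Wait()
	return atomic.LoadInt32(&flag) // after Wait the store has completed
}

func main() {
	fmt.Println(setAndRead()) // 10
}
```

Mixing an atomic store with a plain read of flag (or the reverse) would bring back the data race, which is the point of the "Yes" above.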
 

Are the atomic operations implemented using semaphores (implying possible performance issues), or using instructions on the CPU that guarantees the atomic behavior? Maybe you are not supposed to rely on either, as it may be implementation specified.

Currently all atomic operations use CPU atomic instructions (not counting CAS on linux/arm, which calls a kernel-provided function with unspecified implementation). But, yes, it's an implementation detail; some operations on some architectures may be implemented by means of mutexes in the future. In either case they will be no heavier than mutexes, so you may rely on atomics being no slower than mutexes.
 

Dmitry Vyukov

Oct 1, 2011, 9:01:21 AM
to ⚛, golang-nuts
I would prohibit such programs. However, the current memory model seems to allow it and even guarantees that it will print either 0 or 10. If such behavior conforms to the requirements, then it is a correct program.



> > Slices are thread-safe as long as they refer to non-overlapping memory
> > regions, even when multiple slices are backed by the same array.
>
> The same as strings.

The same as bytes?

No, just undefined behavior.

 

> > Map is the most problematic case. They are (most likely, somebody
> > please confirm this) thread-safe as long as you are not adding
> > anything to the map: concurrently overwriting already existing values
> > is probably safe as long as the keys are different (this is similar to
> > a slice). So this should work:
>
> > m := make(map[string]int)
> > m["a"] = 1
> > m["b"] = 2
> > go func() { m["a"] = 3 }()
> > go func() { m["b"] = 4 }()
>
> Just don't run hg update ;)

I run "hg update", but I didn't notice anything map-related.

You were lucky this time, don't run it again :)

⚛

Oct 1, 2011, 9:06:55 AM
to golang-nuts
On Oct 1, 3:01 pm, Dmitry Vyukov <dvyu...@google.com> wrote:
> On Sat, Oct 1, 2011 at 4:35 PM, ⚛ <0xe2.0x9a.0...@gmail.com> wrote:
> > Also incorrect (?):
>
> > type S struct { a byte }
> > var s S
> > go func() { s.a = 10 }()
> > go func() { println(s.a) }()
>
> > ... since the byte might be read/written by accessing individual bits.
>
> I would prohibit such programs. However, current memory model seems to allow
> it and even guarantees that it will print either 0 or 10. If such behavior
> conforms to requirements, then it is a correct program.

Where do you see the memory model stating that it guarantees that it
will print either 0 or 10 in this case?

Dmitry Vyukov

Oct 1, 2011, 9:08:59 AM
to ⚛, golang-nuts
On Sat, Oct 1, 2011 at 4:43 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
On Oct 1, 2:24 pm, Lars Pensjö <lars.pen...@gmail.com> wrote:
> But channels, they are thread-safe, aren't they?

All operations on channels are, obviously, thread-safe.

Assignments are not thread-safe.

Assigning a channel to another variable is *not* an operation on the
channel. It is an operation on the variable. Thus, I disagree with
Dmitry's interpretation.

string is the string variable. Its internal buffer is an implementation detail, invisible to the user. And the fact that buffers are immutable is an implementation detail as well.
Think of the string as if it is implemented as:
type string struct {
  buf [256]byte
  len int
}
Nothing changes. And thread-safety guarantees become more clear.

Dmitry Vyukov

Oct 1, 2011, 9:24:36 AM
to ⚛, golang-nuts
"a read r may observe the value written by a write w that happens concurrently with r"
It suggests that the read can observe the write and that both are atomic, because it (and the whole doc) talks only about "observing writes", not about "observing partial writes/undefined behavior/etc".
That's what I as a user see there.

⚛

Oct 1, 2011, 9:49:44 AM
to golang-nuts
On Oct 1, 3:24 pm, Dmitry Vyukov <dvyu...@google.com> wrote:
> > Where do you see the memory model stating that it guarantees that it
> > will print either 0 or 10 in this case?
>
> "a read r may observe the value written by a write w that happens
> concurrently with r"
> It suggests that the read can observe the write and that both are atomic,
> because it (and the whole doc) talks only about "observing writes", not
> about "observing partial writes/undefined behavior/etc".
> That's what I as a user see there.

I do not see it there. The sentence "may observe A or B" is different
from "will observe either A or B". For all we know, "may observe A or
B" includes the possibility that you will observe C.

CPU memory model specifications are stricter. See for example "Intel®
64 Architecture Memory Ordering White Paper".

Dmitry Vyukov

Oct 1, 2011, 9:53:40 AM
to ⚛, golang-nuts
On Sat, Oct 1, 2011 at 5:49 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
On Oct 1, 3:24 pm, Dmitry Vyukov <dvyu...@google.com> wrote:
> > Where do you see the memory model stating that it guarantees that it
> > will print either 0 or 10 in this case?
>
> "a read r may observe the value written by a write w that happens
> concurrently with r"
> It suggests that the read can observe the write and that both are atomic,
> because it (and the whole doc) talks only about "observing writes", not
> about "observing partial writes/undefined behavior/etc".
> That's what I as a user see there.

I do not see it there. The sentence "may observe A or B" is different
from "will observe either A or B". For all we know, "may observe A or
B" includes the possibility that you will observe C.

May observe A or B, and since there is no other C that it may observe, it will observe either A or B.

⚛

Oct 1, 2011, 10:31:58 AM
to golang-nuts
On Oct 1, 3:53 pm, Dmitry Vyukov <dvyu...@google.com> wrote:
> On Sat, Oct 1, 2011 at 5:49 PM, ⚛ <0xe2.0x9a.0...@gmail.com> wrote:
> > On Oct 1, 3:24 pm, Dmitry Vyukov <dvyu...@google.com> wrote:
> > > > Where do you see the memory model stating that it guarantees that it
> > > > will print either 0 or 10 in this case?
>
> > > "a read r may observe the value written by a write w that happens
> > > concurrently with r"
> > > It suggests that the read can observe the write and that both are atomic,
> > > because it (and the whole doc) talks only about "observing writes", not
> > > about "observing partial writes/undefined behavior/etc".
> > > That's what I as a user see there.
>
> > I do not see it there. The sentence "may observe A or B" is different
> > from "will observe either A or B". For all we know, "may observe A or
> > B" includes the possibility that you will observe C.
>
> May observe A or B, and since there is no other C that it may observe, it
> will observe either A or B.

Observing C does not contradict the sentence "may observe A or B". The
word "may" means that it is possible to observe A and it is possible
to observe B.

Also, "may" is different from "allowed". There is no explicit mention
of "allowed to observe either A or B" in the Go memory model
specification.

There is no mention of how reads and writes are implemented. For all
we know, it can take 10 years and 123E7 atomic steps to execute a
write, 12 years and 321E5 atomic steps to execute a read.

Dmitry Vyukov

Oct 1, 2011, 10:39:12 AM
to ⚛, golang-nuts
On Sat, Oct 1, 2011 at 6:31 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
On Oct 1, 3:53 pm, Dmitry Vyukov <dvyu...@google.com> wrote:
> On Sat, Oct 1, 2011 at 5:49 PM, ⚛ <0xe2.0x9a.0...@gmail.com> wrote:
> > On Oct 1, 3:24 pm, Dmitry Vyukov <dvyu...@google.com> wrote:
> > > > Where do you see the memory model stating that it guarantees that it
> > > > will print either 0 or 10 in this case?
>
> > > "a read r may observe the value written by a write w that happens
> > > concurrently with r"
> > > It suggests that the read can observe the write and that both are atomic,
> > > because it (and the whole doc) talks only about "observing writes", not
> > > about "observing partial writes/undefined behavior/etc".
> > > That's what I as a user see there.
>
> > I do not see it there. The sentence "may observe A or B" is different
> > from "will observe either A or B". For all we know, "may observe A or
> > B" includes the possibility that you will observe C.
>
> May observe A or B, and since there is no other C that it may observe, it
> will observe either A or B.

Observing C does not contradict the sentence "may observe A or B". The
word "may" means that it is possible to observe A and it is possible
to observe B.


Yes, but there is no C in the program, there are only A (initialization) and B (the write).

Andrew Hart

Oct 1, 2011, 11:51:44 AM
to golang-nuts
On Oct 1, 7:08 am, Dmitry Vyukov <dvyu...@google.com> wrote:
> string is the string variable. It's internal buffer is an implementation
> detail, it's invisible to a user. And the fact that buffers are immutable is
> an implementation detail as well.
> Think of the string as if it is implemented as:
> type string struct {
>   buf [256]byte
>   len int}
>
> Nothing changes. And thread-safety guarantees become more clear.

Although the internal structure of strings is an "implementation
detail", strings being immutable is required by the spec:
A string type represents the set of string values. Strings behave like
arrays of bytes but are immutable: once created, it is impossible to
change the contents of a string. The predeclared string type is string.

Dmitry Vyukov

Oct 1, 2011, 12:02:20 PM
to Andrew Hart, golang-nuts
It merely says YOU can't change strings, that is, s[i] returns an rvalue.
However, a valid implementation can always copy strings and implement s=s+'a' along the lines of:
s.buf[s.len++] = 'a';

Andrew Hart

Oct 1, 2011, 12:06:06 PM
to golang-nuts
> > > type S struct { a byte }
> > > var s S
> > > go func() { s.a = 10 }()
> > > go func() { println(s.a) }()

> Where do you see the memory model stating that it guarantees that it
> will print either 0 or 10 in this case?

There is a similar example in the memory model with the comment "If
the channel were buffered (e.g., c = make(chan int, 1)) then the
program would not be guaranteed to print "hello, world". (It might
print the empty string; it cannot print "goodbye, universe", nor can
it crash.)"

Unfortunately, the justification for this is not apparent. It might
even be wrong. See also http://code.google.com/p/go/issues/detail?id=2277

⚛

Oct 2, 2011, 3:49:16 AM
to golang-nuts, dvy...@google.com, Russ Cox
On Oct 1, 6:06 pm, Andrew Hart <hartandr...@gmail.com> wrote:
> > > > type S struct { a byte }
> > > > var s S
> > > > go func() { s.a = 10 }()
> > > > go func() { println(s.a) }()
> > Where do you see the memory model stating that it guarantees that it
> > will print either 0 or 10 in this case?
>
> There is a similar example in the memory model with the comment "If
> the channel were buffered (e.g., c = make(chan int, 1)) then the
> program would not be guaranteed to print "hello, world". (It might
> print the empty string; it cannot print "goodbye, universe", nor can
> it crash.)"

It can crash.

This is an interesting niche case. It shows that current Go
implementation is unsafe. (I presume the app engine is running Go
programs in a sandbox that prevents malicious actions.)

The question is how to create an implementation of the Go language
which is guaranteed not to crash. Considering how strings and slices
are defined in the specification, is such an implementation even
possible (without seriously affecting performance)?

John Meinel

Oct 2, 2011, 4:48:10 AM
to ⚛, golang-nuts, Russ Cox, dvy...@google.com

I'm pretty sure this has been discussed. GAE doesn't allow more than one thread. Thus no multithreading bugs. You can have many goroutines, but only one will ever be active.

I think the "never crash" version was basically to add a single pointer indirection to any type that is larger than one word. So interface ends up as a pointer to the real 2 word interface, etc. Which adds indirection overhead, but probably better than "only one thread allowed".
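
The indirection idea is about what a compiler could do, but a rough user-level analogue can be sketched with sync/atomic.Value, a type added to the standard library in a later Go release than the one discussed here. The multi-word string sits behind a single reference that is swapped atomically. The names name and update are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// name holds a string behind one atomically swapped reference.
var name atomic.Value

// update races a writer against a reader and returns the final value.
func update() string {
	name.Store("foo")
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); name.Store("bar") }()
	go func() {
		defer wg.Done()
		_ = name.Load().(string) // "foo" or "bar", never a mix of the two
	}()
	wg.Wait()
	return name.Load().(string)
}

func main() {
	fmt.Println(update()) // bar
}
```

As the post says, the price is an extra pointer hop on every read.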

Not sure about map, though.

John
=:->

Nigel Tao

Oct 2, 2011, 6:00:48 AM
to ⚛, golang-nuts, dvy...@google.com, Russ Cox
On 2 October 2011 18:49, ⚛ <0xe2.0x...@gmail.com> wrote:
> The question is how to create an implementation of the Go language
> which is guaranteed not to crash.

http://research.swtch.com/2010/02/off-to-races.html

Dmitry Vyukov

Oct 2, 2011, 8:24:59 AM
to ⚛, golang-nuts, Russ Cox
On Sun, Oct 2, 2011 at 11:49 AM, ⚛ <0xe2.0x...@gmail.com> wrote:
On Oct 1, 6:06 pm, Andrew Hart <hartandr...@gmail.com> wrote:
> > > > type S struct { a byte }
> > > > var s S
> > > > go func() { s.a = 10 }()
> > > > go func() { println(s.a) }()
> > Where do you see the memory model stating that it guarantees that it
> > will print either 0 or 10 in this case?
>
> There is a similar example in the memory model with the comment "If
> the channel were buffered (e.g., c = make(chan int, 1)) then the
> program would not be guaranteed to print "hello, world". (It might
> print the empty string; it cannot print "goodbye, universe", nor can
> it crash.)"

It can crash.

This is an interesting niche case. It shows that current Go
implementation is unsafe. (I presume the app engine is running Go
programs in a sandbox that prevents malicious actions.)

The question is how to create an implementation of the Go language
which is guaranteed not to crash. Considering how strings and slices
are defined in the specification, is such an implementation even
possible (without seriously affecting performance)?


 
Why is crashing a problem?
There are 1001 legal ways for a user to crash his program if he wants, e.g.
var p *int = nil
*p = 0
You can't prevent him from doing it. And you do not need to.

The problem is compromised security. For example, using a data race to hack an interface so that it points to a function that you are not otherwise allowed to execute.

John Asmuth

Oct 2, 2011, 1:26:05 PM
to golan...@googlegroups.com, ⚛


On Saturday, October 1, 2011 7:51:35 AM UTC-4, Dmitry Vyukov wrote:
On Sat, Oct 1, 2011 at 3:38 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
Go strings are fully thread-safe (they are immutable).

Of course they are not. Neither thread-safe nor immutable.
Never ever do this:

type Person struct {
  name string
  ...
}

go func() { person.name = newName }()
go func() { fmt.Printf("%s", person.name) }()

It breaks badly.

Seems to me that it's your code that isn't thread safe - not the string itself. By this argument you can't call any data structure at all threadsafe.

Dmitry Vyukov

Oct 2, 2011, 1:52:08 PM
to golan...@googlegroups.com, ⚛
It is the string. What my program does is: write access to the object + read access to the object, concurrently.
We can rewrite it to make it clearer:
go func() { person.name.Mutate(...) }()
go func() { fmt.Printf("%s", person.name.Read()) }()
If the object were fully thread-safe, it would be OK to call several methods (read/write) concurrently.

Jesse McNelis

Oct 2, 2011, 2:04:59 PM
to golan...@googlegroups.com, John Asmuth
On 03/10/11 04:26, John Asmuth wrote:
> type Person struct {
> name string
> ...
> }
>
> go func() { person.name = newName }()
> go func() { fmt.Printf("%s", person.name) }()

>
> It breaks badly.
>
>
> Seems to me that it's your code that isn't thread safe - not the string
> itself. By this argument you can't call any data structure at all
> threadsafe.

The problem is that you can end up with person.name containing a pointer
to an immutable array(for the string contents) and a length from the
previous string, if the scheduler happened to switch threads part way
through the update.
This could result in the printing of the string reading off the end of
the array.

A thread-safe data structure would give you some working value, either
the old or the new not a broken partial of both.
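
The "more than one word" part can be checked directly. A small sketch, assuming the gc toolchain, where a string variable is a (pointer, length) pair; the language specification does not promise this layout, and stringWords is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringWords reports how many machine words a string variable occupies.
func stringWords() uintptr {
	var s string
	return unsafe.Sizeof(s) / unsafe.Sizeof(uintptr(0))
}

func main() {
	// With the gc toolchain this prints 2: one word for the data pointer,
	// one for the length. Two words cannot be replaced by a single plain store.
	fmt.Println(stringWords())
}
```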

Dmitry Vyukov

Oct 2, 2011, 2:23:49 PM
to Vanja Pejovic, golan...@googlegroups.com, ⚛
On Sun, Oct 2, 2011 at 10:09 PM, Vanja Pejovic <vva...@gmail.com> wrote:
It seems to me that this:
go func() { person.name = newName }()
go func() { fmt.Printf("%s", person.name) }()
is very different from this:
go func() { person.name.Mutate(...) }()
go func() { fmt.Printf("%s", person.name.Read()) }()

- In the first version, you are assigning a variable and passing that variable to a function. This is not thread safe since strings are bigger than a word.

It is not safe because it is not safe. We know neither the size of a string nor the size of a machine word.

 
- In the second version, you have a variable which presumably already contains an object, and you make two different calls on that object. This is thread safe since strings do not change their state, so regardless of the order of events, both goroutines see the same string.

How I understand it, calling immutable objects inherently thread-safe follows from the fact that once you have created an instance of the object, you can give that instance to as many concurrently running threads as you want, and they will all see the object in a valid state, because it has only one state, the one it was created with.


The following function:

func foo(s *string) {
  println(*s)
  bar()
  println(*s)
}

can output:
foo
bar 

Which makes it obvious that strings are not immutable and do change their state.

Dmitry Vyukov

Oct 2, 2011, 3:13:24 PM
to Vanja Pejovic, golan...@googlegroups.com, ⚛
On Sun, Oct 2, 2011 at 10:45 PM, Vanja Pejovic <vva...@gmail.com> wrote:
I don't have that much experience with Go, but if strings are mutable, then I can see how they are not thread safe. I'm not sure how your example function can possibly result in that output though. I must be missing something.

package main

func foo(s *string) {
  println(*s)
  bar()
  println(*s)
}

var gs string

func bar() {
  gs = "bar"
}

func main() {
  gs = "foo"
  foo(&gs)
}

Strings lack a mutating indexing operator (an operator[] that returns an lvalue) and other common mutation operations (replace substring), but they do have a mutation operation: namely, the assignment operator, which says "replace my contents with that new string". And that mutation operation is complete in the sense that one can build all other mutation operations with it:

func AppendChar(s *string, c byte) {
    *s = string(append([]byte(*s), c))
}

func SetChar(s *string, idx int, c byte) {
    a := []byte(*s)
    a[idx] = c
    *s = string(a)
}

func main() {
    var s string
    AppendChar(&s, 'a')
    AppendChar(&s, 'b')
    AppendChar(&s, 'c')
    println(s)
    SetChar(&s, 1, 'x')
    println(s)
}

outputs:
abc
axc

How does it differ from mutable C++ strings? Semantically it does not. It's just not all that efficient. However, it allows some other important optimizations.

Go strings are perfectly mutable.

Jan Mercl

Oct 2, 2011, 4:25:58 PM
to golan...@googlegroups.com
On Sunday, October 2, 2011 8:23:49 PM UTC+2, Dmitry Vyukov wrote:
The following function:

func foo(s *string) {
  println(*s)
  bar()
  println(*s)
}

can output:
foo
bar 

Which makes it obvious that strings are not immutable and do change their state.

No. What mutates in foo is a pointer to a string. The thing s pointed to on entry to foo is not changed by mutating s, nor by any way of dereferencing it. Strings in Go are immutable. Only the concrete implementation of Go (6g and maybe gccgo too) is racy.
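
One way to make the two readings concrete is to take a copy of the value before bar() runs and compare it with what the variable holds afterwards. A minimal sketch reusing gs and bar from the example under discussion; observe is a hypothetical helper.

```go
package main

import "fmt"

var gs string

func bar() { gs = "bar" }

// observe returns a copy taken before bar() runs and the variable's value afterwards.
func observe(s *string) (before, after string) {
	before = *s // copies the string value the variable holds right now
	bar()       // assigns a new value to the variable s points to
	return before, *s
}

func main() {
	gs = "foo"
	b, a := observe(&gs)
	fmt.Println(b, a) // foo bar
}
```

The copy still reads "foo" while the variable now reads "bar": the value that was copied did not change, the variable did. Which of those two things deserves the name "the string" is what the disagreement is about.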

Dmitry Vyukov

Oct 2, 2011, 4:38:47 PM
to golan...@googlegroups.com
On Mon, Oct 3, 2011 at 12:25 AM, Jan Mercl <jan....@nic.cz> wrote:
On Sunday, October 2, 2011 8:23:49 PM UTC+2, Dmitry Vyukov wrote:
The following function:

func foo(s *string) {
  println(*s)
  bar()
  println(*s)
}

can output:
foo
bar 

Which makes it obvious that strings are not immutable and do change their state.

No. What mutates in foo is a pointer to a string.

Are you joking?

func foo(s *string) {
  println(s, *s)
  bar()
  println(s, *s)
}

0x28f50 foo
0x28f50 bar

It is exactly THE SAME string object.
 

The thing s pointed to on entry to foo is not changed by mutating s, nor by any way of dereferencing it. Strings in Go are immutable. Only the concrete implementation of Go (6g and maybe gccgo too) is racy.

How then would you explain that exactly the same string object has different values?

Vanja Pejovic

Oct 2, 2011, 2:45:31 PM
to Dmitry Vyukov, golan...@googlegroups.com, ⚛
I don't have that much experience with Go, but if strings are mutable, then I can see how they are not thread safe. I'm not sure how your example function can possibly result in that output though. I must be missing something.

Vanja Pejovic

Oct 2, 2011, 2:09:37 PM
to Dmitry Vyukov, golan...@googlegroups.com, ⚛
It seems to me that this:
go func() { person.name = newName }()
go func() { fmt.Printf("%s", person.name) }()
is very different from this:
go func() { person.name.Mutate(...) }()
go func() { fmt.Printf("%s", person.name.Read()) }()

- In the first version, you are assigning a variable and passing that variable to a function. This is not thread safe since strings are bigger than a word.
- In the second version, you have a variable which presumably already contains an object, and you make two different calls on that object. This is thread safe since strings do not change their state, so regardless of the order of events, both goroutines see the same string.

As I understand it, calling immutable objects inherently thread-safe follows from the fact that once you have created an instance of the object, you can give that instance to as many concurrently running threads as you want, and they will all see the object in a valid state, because it has only one state, the one it was created with.
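For the first case above (assigning to a shared string variable while another goroutine reads it), a minimal sketch of one way to make it safe is to guard the variable with a mutex. The Person type and its SetName and Name methods are hypothetical names chosen for illustration; they are not from the thread.

```go
package main

import (
	"fmt"
	"sync"
)

// Person is a hypothetical type. The mutex guards the name variable,
// not the bytes of any particular string value.
type Person struct {
	mu   sync.Mutex
	name string
}

// SetName replaces the string value stored in the variable.
func (p *Person) SetName(n string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.name = n
}

// Name returns a copy of the string value currently stored.
func (p *Person) Name() string {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.name
}

func main() {
	p := &Person{name: "old"}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); p.SetName("new") }()
	// The reader sees either "old" or "new", never a torn mixture.
	go func() { defer wg.Done(); fmt.Println(p.Name()) }()
	wg.Wait()
}
```

The value returned by Name is an ordinary string, so the caller can pass it around freely without further locking.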

Ostsol

Oct 2, 2011, 8:59:08 PM10/2/11
to golan...@googlegroups.com, Dmitry Vyukov, ⚛
On Sunday, 2 October 2011 12:45:31 UTC-6, vvaffle wrote:
I don't have that much experience with Go, but if strings are mutable, then I can see how they are not thread safe. I'm not sure how your example function can possibly result in that output though. I must be missing something.

I think that it's best to say that in the default implementation (6g et al.), the string object is mutable, while the contents of the object are immutable. Complete immutability could only be achieved if assignment after initialization were forbidden. I recall a proposal to that effect for all types (though I forget the programming language referenced). I suppose that if string objects were handled only as pointers, that would also confer immutability. . .

What Dmitry has shown is nothing surprising, given how strings are implemented in Go. If you implement a Go string in C (as a struct containing a char pointer and an integer length) the result is exactly the same. A string assignment is simply a struct-copy operation, after all (in the default implementation). The actual data is not changed, but replaced.

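#include <stdio.h>
#include <string.h>

/* A Go-style string header: a pointer to the bytes plus a length. */
typedef struct {
    const char *str;
    int l;
} String;

String new_string(const char *str) {
    String ns = {str, (int)strlen(str)};
    return ns;
}

String s;

void bar(void) {
    /* Struct copy: the header is replaced, the old bytes are untouched. */
    s = new_string("bar");
}

void foo(String *s1) {
    printf("%p %s\n", (void *)s1, s1->str);
    bar();
    printf("%p %s\n", (void *)s1, s1->str);
}

int main(void) {
    s = new_string("foo");
    foo(&s);
    return 0;
}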

The lack of thread safety also comes from a struct copy not being an atomic operation.

-Daniel (<- hoping to have said something intelligent, today)

Steven Blenkinsop

Oct 2, 2011, 10:29:29 PM10/2/11
to Dmitry Vyukov, golan...@googlegroups.com
Dmitry, you are arguing a triviality where both answers are right. It's just a matter of perspective.

On Sun, Oct 2, 2011 at 4:38 PM, Dmitry Vyukov <dvy...@google.com> wrote:
It is exactly THE SAME string object.

Depends. It's exactly the same variable, the same memory location. However, the idea of what the "string object" is will often refer to the value, not the memory location. In which case, you've merely replaced the object with another one, rather than mutating a string object.

On Sun, Oct 2, 2011 at 3:13 PM, Dmitry Vyukov <dvy...@google.com> wrote:
Go strings are perfectly mutable.

 From the spec:
Strings behave like arrays of bytes but are immutable: once created, it is impossible to change the contents of a string.

Again, you're clearly arguing with a different perspective on what mutation is than the one contained in the spec. I'll venture that the distinction comes from the difference between value types and reference types.

With reference types, reassignment is clearly different from mutation. Mutating the object affects all variables or expressions that refer to that object, while reassigning the variable does not. With value types, though, the distinction doesn't really exist. Reassignment is mutation, as anything referring to the original elements will be affected by a reassignment.

The problem with strings is that they fall in that fuzzy zone. A mutable value and a mutable reference to an immutable value are semantically equivalent. So, they can be considered from either perspective. Obviously, the spec is considering them as mutable references to immutable values, since that's how it describes them (an immutable byte slice). I'd say I'd prefer to use this description for that reason. You're arguing for talking about them as values, which is fine, but it's not very useful to argue about it.

Anyways, allowing multiple goroutines to access one string value without synchronization is unsafe, but giving them copies of the same string value is safe. Since strings are implemented as reference types (with immutable contents), copying the string value is reasonably efficient. That's all you need to know. Arguing over definitions of mutability and thread safety is a waste of time.
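A minimal sketch of the "give them copies" advice, assuming a hypothetical helper named observe: each goroutine receives the string as a by-value argument, so reassigning the original variable afterwards cannot affect what the goroutines see.

```go
package main

import (
	"fmt"
	"sync"
)

// observe starts the given number of goroutines, handing each its own
// copy of s, then reassigns s while they may still be running.
func observe(workers int) (seen []string, final string) {
	s := "foo"
	seen = make([]string, workers)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		// The arguments are evaluated here, in the calling goroutine.
		go func(i int, local string) {
			defer wg.Done()
			seen[i] = local // each goroutine writes a distinct element
		}(i, s)
	}
	s = "bar" // only the caller's variable changes
	wg.Wait()
	return seen, s
}

func main() {
	seen, final := observe(3)
	fmt.Println(seen, final) // [foo foo foo] bar
}
```

Only the small string header is copied per goroutine in the usual implementations, which is why this is cheap.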

Andrew Hart

Oct 2, 2011, 11:05:35 PM10/2/11
to golang-nuts
This thread has made me think about slices vs arrays and reference
types vs value types. Thanks :-)

On Oct 2, 8:29 pm, Steven Blenkinsop <steven...@gmail.com> wrote:
> The problem with strings is that they fall in that fuzzy zone. A mutable
> value and a mutable reference to an immutable value are semantically
> equivalent. So, they can be considered from either perspective. Obviously,
> the spec is considering them as mutable references to immutable values,
> since that's how it describes them (an immutable byte slice). I'd say I'd
> prefer to use this description for that reason. You're arguing for talking
> about them as values, which is fine, but it's not very useful to argue about
> it.

Except with strings, the spec isn't clear that strings are reference
types. It says strings are like arrays. Arrays are value types. Slices
are reference types.



The go spec seems misleading when defining strings. Strings look like
slices of unaddressable byte arrays, not like immutable byte arrays.
[1]

As Dmitry has pointed out, strings don't look immutable.

The diagrams in http://research.swtch.com/2009/11/go-data-structures.html
[2] make strings (s and t) look like slices (x and y) not arrays
(bytes).

The slice of an array is a slice; the slice of a slice is a slice.
Like a slice (unlike an array), the slice of a string is the same type
(a string).

Arrays of different lengths are different types. Slices of different
lengths are the same type. There is only one string type.

[1] The definition of addressable might also replace "or slice
indexing operation;" with "or slice indexing operation on a slice of
an addressable array;"
[2] This is, of course, implementation, not spec.
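The type observations above can be checked with a short sketch. The helper typeOf is a hypothetical name; note that fmt reports byte as uint8.

```go
package main

import "fmt"

// typeOf returns the dynamic type of v as fmt's %T verb prints it.
func typeOf(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	arr := [5]byte{'h', 'e', 'l', 'l', 'o'}
	sl := arr[1:4]
	str := "hello"

	fmt.Println(typeOf(arr))       // [5]uint8
	fmt.Println(typeOf([3]byte{})) // [3]uint8: a different type from [5]uint8
	fmt.Println(typeOf(sl))        // []uint8
	fmt.Println(typeOf(sl[1:]))    // []uint8: slicing a slice keeps the type
	fmt.Println(typeOf(str[1:4]))  // string: slicing a string keeps the type
}
```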

Steven Blenkinsop

Oct 2, 2011, 11:35:03 PM10/2/11
to Andrew Hart, golang-nuts
On Sun, Oct 2, 2011 at 11:05 PM, Andrew Hart <harta...@gmail.com> wrote:
Except with strings, the spec isn't clear that strings are reference
types. It says strings are like arrays. Arrays are value types. Slices
are reference types.

Right, somehow I misread that. It does say array, which is a bit weird, because of the reasons you detail later. It wouldn't be the first time, though, that the Go authors said "array" when they meant "slice"...

The go spec seems misleading when defining strings. Strings look like
slices of unaddressable byte arrays, not like immutable byte arrays.

I'll quibble with this. You can't slice an unaddressable array. I'd say they look like slices of immutable byte arrays. The elements of the slice aren't addressable solely because Go's type system cannot represent the type of such an expression (pointer to an immutable byte), though the ability to take the address of the elements would imply reference semantics (so that reassigning with a shorter string couldn't make a pointer point to a non-existent element).

As Dmitry has pointed out, strings don't look immutable.

Again, they look immutable as long as you consider them to be reference types, and mutable if you consider them to be value types. However, since you can't take the address of the elements, there's no way to tell the difference, to see whether a reassignment has mutated the original elements (which would be dangerous, since some of the elements may no longer exist) or just made the value refer to a new set of elements. By the same token, it doesn't matter. However, as far as implementation goes, we know it has to be implemented using a reference so that the value can have a fixed size regardless of the length of the string.

Anthony Martin

Oct 2, 2011, 9:53:15 PM10/2/11
to Dmitry Vyukov, golan...@googlegroups.com
Dmitry Vyukov <dvy...@google.com> once said:
> Are you joking?
>
> func foo(s *string) {
> println(s, *s)
> bar()
> println(s, *s)
> }
>
> 0x28f50 foo
> 0x28f50 bar
>
> It is exactly THE SAME string object.

No, it's not. The spec says that the contents of a
string are immutable. A variable can be assigned a
new value but the value cannot be modified.

$ cat >a.go
package main

import "unsafe"

func show(s *string) {
	type String struct {
		ptr *byte
		len int32
	}
	sp := (*String)(unsafe.Pointer(s))
	println(s, "{", sp.ptr, sp.len, *s, "}")
}

func main() {
	var s string

	s = "foo"
	show(&s)

	s = "bar"
	show(&s)
}
$ 8g a.go && 8l a.8
$ ./8.out
0xb7894fcc { 0x80593b8 3 foo }
0xb7894fcc { 0x8059394 3 bar }
$

The value residing at address 0xb7894fcc in memory
can vary but the values themselves cannot.

I don't understand why there's so much discussion
about this.

1. String values are immutable.
2. String variables are not thread-safe.

It's simple: the memory model does not guarantee
that accessing a variable will be atomic. If this
is important in your application you must use some
kind of synchronization.

Cheers,
Anthony
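As one possible form of "some kind of synchronization", here is a sketch using atomic.Value from sync/atomic. That type was added to the standard library after this thread was written, so it is an assumption about a newer toolchain; the names current, setCurrent and getCurrent are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// current holds the shared string. Store and Load replace and read the
// whole value atomically, so a reader never sees a half-written header.
var current atomic.Value

func setCurrent(s string) { current.Store(s) }

// getCurrent must only be called after at least one setCurrent,
// otherwise the type assertion on a nil interface panics.
func getCurrent() string { return current.Load().(string) }

func main() {
	setCurrent("foo")
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); setCurrent("bar") }()
	go func() { defer wg.Done(); fmt.Println(getCurrent()) }() // "foo" or "bar"
	wg.Wait()
}
```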

Andrew Hart

Oct 3, 2011, 1:33:22 AM10/3/11
to Steven Blenkinsop, golang-nuts
On Sun, Oct 2, 2011 at 9:35 PM, Steven Blenkinsop <stev...@gmail.com> wrote:
The go spec seems misleading when defining strings. Strings look like
slices of unaddressable byte arrays, not like immutable byte arrays.

I'll quibble with this. You can't slice an unaddressable array. I'd say they look like slices of immutable byte arrays. The elements of the slice aren't addressable solely because Go's type system cannot represent the type of such an expression (pointer to an immutable byte), ...

I like your description of the bytes as not inherently unaddressable, but as a type the address of which the (current) type system cannot represent.

Thinking this way requires that an immutable byte be assignable to a mutable byte.
 
...though the ability to take the address of the elements would imply reference semantics (so that reassigning with a shorter string couldn't make a pointer point to a non-existent element).

That seems to fit well with thinking of strings as slices, rather than arrays.

Jan Mercl

Oct 3, 2011, 5:34:41 AM10/3/11
to golan...@googlegroups.com
On Sunday, October 2, 2011 10:38:47 PM UTC+2, Dmitry Vyukov wrote:
On Mon, Oct 3, 2011 at 12:25 AM, Jan Mercl <jan....@nic.cz> wrote:
On Sunday, October 2, 2011 8:23:49 PM UTC+2, Dmitry Vyukov wrote:
The following function:

func foo(s *string) {
  println(*s)
  bar()
  println(*s)
}

can output:
foo
bar 

Which makes it obvious that strings are not immutable and do change their state.

No. What mutates in foo is a pointer to a string.

Are you joking?

No, I'm not.
 

func foo(s *string) {
  println(s, *s)
  bar()
  println(s, *s)
}

0x28f50 foo
0x28f50 bar

It is exactly THE SAME string object.

A pointer to an entity is neither identical nor equal to the entity it points to. Here 's' is not a string object. If 's' were not a pointer to a string but a string variable/parameter, then the thing stored in the variable could change. The underlying original string cannot (see below).
 
 
The thing s pointed to on entry to foo is not changed by mutating s, nor by any way of dereferencing it. Strings in Go are immutable. Only the concrete implementation of Go (6g and maybe gccgo too) is racy.

How then would you explain that exactly the same string object has different values?

This might be caused by mixing up C string semantics (a string is a char *) with the Go way (a string is a value type [with an implementation-specific internal reference to the backing byte array]).

I think this program illustrates what I meant: http://goo.gl/D4srl
Please note the last printed value of 't'. I hope it shows one example case of what the string immutability is in Go.

Dmitry Vyukov

Oct 3, 2011, 5:56:47 AM10/3/11
to Anthony Martin, golan...@googlegroups.com
On Mon, Oct 3, 2011 at 5:53 AM, Anthony Martin <al...@pbrane.org> wrote:
Dmitry Vyukov <dvy...@google.com> once said:
> Are you joking?
>
> func foo(s *string) {
>   println(s, *s)
>   bar()
>   println(s, *s)
> }
>
> 0x28f50 foo
> 0x28f50 bar
>
> It is exactly THE SAME string object.

No, it's not.  The spec says that the contents of a
string are immutable.  A variable can be assigned a
new value but the value cannot be modified.

In what way is assignment not modification?
The only way you can peek into string contents is to print them, examine chars, etc., and what you see there mutates over time -> string contents are mutable. The only thing the spec says is that you can't write s[1] = 'a'.

 

$ cat >a.go
package main

import "unsafe"

func show(s *string) {
       type String struct {
               ptr *byte
               len int32
       }
       sp := (*String)(unsafe.Pointer(s))
       println(s, "{", sp.ptr, sp.len, *s, "}")
}

func main() {
       var s string

       s = "foo"
       show(&s)

       s = "bar"
       show(&s)
}
$ 8g a.go && 8l a.8
$ ./8.out
0xb7894fcc { 0x80593b8 3 foo }
0xb7894fcc { 0x8059394 3 bar }
$

The value residing at address 0xb7894fcc in memory
can vary but the values themselves cannot.

Do you mean that there is some Go implementation that uses COW? I can concede that.
In my implementation string is
struct string {
  char buf[255];
  int len;
};
And string assignment does mutate the previous value. I argue that this is a conforming implementation, that is, you can't tell the difference.

Dmitry Vyukov

Oct 3, 2011, 6:16:57 AM10/3/11
to Steven Blenkinsop, golan...@googlegroups.com
On Mon, Oct 3, 2011 at 6:29 AM, Steven Blenkinsop <stev...@gmail.com> wrote:
Dmitry, you are arguing a triviality where both answers are right. It's just a matter of perspective.

On Sun, Oct 2, 2011 at 4:38 PM, Dmitry Vyukov <dvy...@google.com> wrote:
It is exactly THE SAME string object.

Depends. It's exactly the same variable, the same memory location. However, the idea of what the "string object" is will often refer to the value, not the memory location. In which case, you've merely replaced the object with another one, rather than mutating a string object.

On Sun, Oct 2, 2011 at 3:13 PM, Dmitry Vyukov <dvy...@google.com> wrote:
Go strings are perfectly mutable.

 From the spec:
Strings behave like arrays of bytes but are immutable: once created, it is impossible to change the contents of a string.

Again, you're clearly arguing with a different perspective on what mutation is than the one contained in the spec. I'll venture that the distinction comes from the difference between value types and reference types.

With reference types, reassignment is clearly different from mutation. Mutating the object affects all variables or expressions that refer to that object, while reassigning the variable does not. With value types, though, the distinction doesn't really exist. Reassignment is mutation, as anything referring to the original elements will be affected by a reassignment.

The problem with strings is that they fall in that fuzzy zone. A mutable value and a mutable reference to an immutable value are semantically equivalent. So, they can be considered from either perspective. Obviously, the spec is considering them as mutable references to immutable values, since that's how it describes them (an immutable byte slice). I'd say I'd prefer to use this description for that reason. You're arguing for talking about them as values, which is fine, but it's not very useful to argue about it.

I completely agree. You put things in a good order.
My point was only that user does not care about value vs reference types, mutable value vs mutable reference to an immutable value; all he has is a string value that he can observe with println() and that value is changing over time.
My second point is that
struct string {
  char buf[255];
  int len;
};
is a conforming implementation. The specification is there only to define semantics, not to restrict implementations. So, there is not necessarily a pointer/reference to something else in the string at all. The string buffer is not necessarily immutable. Copying is not necessarily cheap. However, of course, taking into account the semantics defined and the presence of GC, all that is reasonable to assume.
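To make the inline-buffer idea concrete, here is a sketch written in Go itself rather than inside a compiler. The fixedString type and newFixed are hypothetical, and the 255-byte cap means this is only an illustration of value semantics, not a complete string implementation (real Go strings can be longer).

```go
package main

import "fmt"

// fixedString is a hypothetical value-type string: the bytes live
// inline, so assignment copies the whole buffer instead of a pointer.
type fixedString struct {
	buf [255]byte
	n   int
}

// newFixed copies s into an inline buffer, truncating past 255 bytes.
func newFixed(s string) fixedString {
	var f fixedString
	f.n = copy(f.buf[:], s)
	return f
}

func (f fixedString) String() string { return string(f.buf[:f.n]) }

func main() {
	a := newFixed("foo")
	b := a              // deep copy of the buffer
	a = newFixed("bar") // a's storage is overwritten in place
	fmt.Println(a, b)   // bar foo
}
```

Observed through a and b alone, the behavior matches the pointer-plus-length representation: overwriting a never changes b.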

Dmitry Vyukov

Oct 3, 2011, 6:21:28 AM10/3/11
to Andrew Hart, golang-nuts
On Mon, Oct 3, 2011 at 7:05 AM, Andrew Hart <harta...@gmail.com> wrote:
This thread has made me think about slices vs arrays and reference
types vs value types. Thanks :-)

On Oct 2, 8:29 pm, Steven Blenkinsop <steven...@gmail.com> wrote:
> The problem with strings is that they fall in that fuzzy zone. A mutable
> value and a mutable reference to an immutable value are semantically
> equivalent. So, they can be considered from either perspective. Obviously,
> the spec is considering them as mutable references to immutable values,
> since that's how it describes them (an immutable byte slice). I'd say I'd
> prefer to use this description for that reason. You're arguing for talking
> about them as values, which is fine, but it's not very useful to argue about
> it.

Except with strings, the spec isn't clear that strings are reference
types. It says strings are like arrays. Arrays are value types. Slices
are reference types.



The go spec seems misleading when defining strings. Strings look like
slices of unaddressable byte arrays, not like immutable byte arrays.
[1]

Yeah, but that is irrelevant, in the sense that one can't tell the difference. Moreover, I believe that within a single language implementation some strings can be implemented as satellite data and others as embedded data. Some copy operations do a deep copy, and others do a shallow one.

roger peppe

Oct 3, 2011, 6:21:36 AM10/3/11
to Dmitry Vyukov, Steven Blenkinsop, golan...@googlegroups.com
On 3 October 2011 11:16, Dmitry Vyukov <dvy...@google.com> wrote:
> I completely agree. You put things in a good order.
> My point was only that user does not care about value vs reference types,
> mutable value vs mutable reference to an immutable value; all he has is a
> string value that he can observe with println() and that value is changing
> over time.

only in certain cases (that is, when a reference has been taken).

if i have this function:

func f(s string) {
	print(s)
	waitForSomething()
	print(s)
}

i know that it *must* print the same thing twice.

that's what i'd understand by immutable - it's the
same semantics as an int, for example.
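A runnable variant of this function, as a sketch: here waitForSomething is passed in as a parameter (an assumption made so the example is self-contained), and it reassigns the caller's variable while f is in progress.

```go
package main

import "fmt"

// f receives its own copy of the string value. Whatever happens to the
// caller's variable during waitForSomething, s stays the same.
func f(s string, waitForSomething func()) (first, second string) {
	first = s
	waitForSomething()
	second = s
	return first, second
}

func main() {
	shared := "foo"
	a, b := f(shared, func() { shared = "bar" })
	fmt.Println(a, b, shared) // foo foo bar
}
```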

roger peppe

Oct 3, 2011, 6:22:30 AM10/3/11
to Dmitry Vyukov, Steven Blenkinsop, golan...@googlegroups.com
On 3 October 2011 11:21, roger peppe <rogp...@gmail.com> wrote:
> that's what i'd understand by immutable - it's the
> same semantics as an int, for example.

(atomicity of assignment guarantees aside, of course)

Dmitry Vyukov

Oct 3, 2011, 6:30:53 AM10/3/11
to golan...@googlegroups.com
On Mon, Oct 3, 2011 at 1:34 PM, Jan Mercl <jan....@nic.cz> wrote:
On Sunday, October 2, 2011 10:38:47 PM UTC+2, Dmitry Vyukov wrote:
On Mon, Oct 3, 2011 at 12:25 AM, Jan Mercl <jan....@nic.cz> wrote:
On Sunday, October 2, 2011 8:23:49 PM UTC+2, Dmitry Vyukov wrote:
The following function:

func foo(s *string) {
  println(*s)
  bar()
  println(*s)
}

can output:
foo
bar 

Which makes it obvious that strings are not immutable and do change their state.

No. What mutates in foo is a pointer to a string.

Are you joking?

No, I'm not.
 

func foo(s *string) {
  println(s, *s)
  bar()
  println(s, *s)
}

0x28f50 foo
0x28f50 bar

It is exactly THE SAME string object.

A pointer to an entity is neither identical nor equal to the entity it points to. Here 's' is not a string object.

I do not output 's', I output '*s', and '*s' is string object, and its value is changing over time.

 
If 's' were not a pointer to a string but a string variable/parameter, then the thing stored in the variable could change. The underlying original string cannot (see below).

I as a user don't know nor care what is "underlying original string".

Dmitry Vyukov

Oct 3, 2011, 6:35:46 AM10/3/11
to roger peppe, Steven Blenkinsop, golan...@googlegroups.com
That's a very basic property that holds for basically all types - if you have a private object that nobody else has references to and you do not mutate it, then its value does not change. It's not immutability.
Strings are like ints, right. And ints are mutable (i++, or i = i + 1). And you can do the same with strings:
func f(s string) {
  print(s)
  s = s + "a";
  print(s)
}

Jan Mercl

Oct 3, 2011, 6:42:37 AM10/3/11
to golan...@googlegroups.com
On Monday, October 3, 2011 12:30:53 PM UTC+2, Dmitry Vyukov wrote:
I do not output 's', I output '*s', and '*s' is string object, and its value is changing over time.

This is a string object:
        "foo"

This is a string variable with an initializer, so it holds a string object (in an implementation specific way):
        var s = "foo"

This is a pointer to a string variable:
        var p = &s

Those three things are very/really/completely different. You are observing changes made to a variable (through dereferencing a pointer, though that's not important here). Variables are mutable (that's their essence). String objects are not mutable. Per specs, per previously shown examples. Let's ignore the implementation races which can break the guarantee.
 
I as a user don't know nor care what is "underlying original string".

Actually, in a lot of places programs do care exactly about that. That's what string immutability, in any programming language supporting it, is for.

Dmitry Vyukov

Oct 3, 2011, 6:58:33 AM10/3/11
to golan...@googlegroups.com
On Mon, Oct 3, 2011 at 2:42 PM, Jan Mercl <jan....@nic.cz> wrote:
On Monday, October 3, 2011 12:30:53 PM UTC+2, Dmitry Vyukov wrote:
I do not output 's', I output '*s', and '*s' is string object, and its value is changing over time.

This is a string object:
        "foo"

This is a string variable with an initializer, so it holds a string object (in an implementation specific way):
        var s = "foo"

This is a pointer to a string variable:
        var p = &s

The predeclared string type is string.
 

Those three things are very/really/completely different. You are observing changes made to a variable (through dereferencing a pointer, though that's not important here). Variables are mutable (that's their essence). String objects are not mutable. Per specs, per previously shown examples. Let's ignore the implementation races which can break the guarantee.
 
I as a user don't know nor care what is "underlying original string".

Actually in lot of places programs do care exactly about that. That's what string immutability, in any programming language supporting it, is for.

Maybe yes, maybe not. It's irrelevant.

Just in case: I understand that a reasonable Go implementation implements strings as a pointer to an immutable buffer. But that does not make them immutable. That would mean that strings must be initialized to their final (immutable) value during construction. That is not the case in Go, where you frequently create an empty string and then *mutate* it (possibly several times) to contain a desired value.


Jan Mercl

Oct 3, 2011, 7:18:17 AM10/3/11
to golan...@googlegroups.com
On Monday, October 3, 2011 12:58:33 PM UTC+2, Dmitry Vyukov wrote:
On Mon, Oct 3, 2011 at 2:42 PM, Jan Mercl <jan....@nic.cz> wrote:
On Monday, October 3, 2011 12:30:53 PM UTC+2, Dmitry Vyukov wrote:
I do not output 's', I output '*s', and '*s' is string object, and its value is changing over time.

This is a string object:
        "foo"

This is a string variable with an initializer, so it holds a string object (in an implementation specific way):
        var s = "foo"

This is a pointer to a string variable:
        var p = &s

The predeclared string type is string.

Types are yet another topic, but let me follow: then a pointer to string (which you call, for reasons not known to me, a "string object") is not the same type as string, and so the string immutability guarantee has nothing to do with it, right?
 
 

Those three things are very/really/completely different. You are observing changes made to a variable (through dereferencing a pointer, though that's not important here). Variables are mutable (that's their essence). String objects are not mutable. Per specs, per previously shown examples. Let's ignore the implementation races which can break the guarantee.
 
I as a user don't know nor care what is "underlying original string".

Actually in lot of places programs do care exactly about that. That's what string immutability, in any programming language supporting it, is for.

Maybe yes, maybe not. It's irrelevant.

Just in case: I understand that a reasonable Go implementation implements strings as a pointer to an immutable buffer. But that does not make them immutable.

Go strings (objects, not to be confused with string variables) are immutable. Another way to think about the problem is, for example: regardless of the Go implementation, the string value/object is just "somewhere" in the computer's memory. That very memory cannot be, once constructed, written to by a Go program (sans "unsafe") as long as that value is reachable. A string variable, in contrast, can freely change its value, e.g. by "somehow" having the values of different string objects at different times.

That would mean that strings must be initialized to their final (immutable ) value during construction. That is not the case in Go,

It's exactly this case in Go.
 
where you frequently create an empty string and then *mutate* it (possibly several times) to contain a desired value.

The variables are changing, not the string objects.

Dmitry Vyukov

Oct 3, 2011, 7:23:09 AM10/3/11
to golan...@googlegroups.com
On Mon, Oct 3, 2011 at 3:18 PM, Jan Mercl <jan....@nic.cz> wrote:
On Monday, October 3, 2011 12:58:33 PM UTC+2, Dmitry Vyukov wrote:
On Mon, Oct 3, 2011 at 2:42 PM, Jan Mercl <jan....@nic.cz> wrote:
On Monday, October 3, 2011 12:30:53 PM UTC+2, Dmitry Vyukov wrote:
I do not output 's', I output '*s', and '*s' is string object, and its value is changing over time.

This is a string object:
        "foo"

This is a string variable with an initializer, so it holds a string object (in an implementation specific way):
        var s = "foo"

This is a pointer to a string variable:
        var p = &s

The predeclared string type is string.

Types are yet another topic, but let me follow: then a pointer to string (which you call, for reasons not known to me, a "string object") is not the same type as string, and so the string immutability guarantee has nothing to do with it, right?

I do not output 's', I output '*s', and '*s' is exactly the same string object, and its value is changing over time.

 
 
 

Those three things are very/really/completely different. You are observing changes made to a variable (through dereferencing a pointer, though that's not important here). Variables are mutable (that's their essence). String objects are not mutable. Per specs, per previously shown examples. Let's ignore the implementation races which can break the guarantee.
 
I as a user don't know nor care what is "underlying original string".

Actually in lot of places programs do care exactly about that. That's what string immutability, in any programming language supporting it, is for.

Maybe yes, maybe not. It's irrelevant.

Just in case: I understand that a reasonable Go implementation implements strings as a pointer to an immutable buffer. But that does not make them immutable.

Go strings (objects, not to be confused with string variables) are immutable. Another way to think about the problem is, for example: regardless of the Go implementation, the string value/object is just "somewhere" in the computer's memory. That very memory cannot be, once constructed, written to by a Go program (sans "unsafe") as long as that value is reachable.

We don't know. It perfectly can be mutated w/o violating the language spec.

struct string {
  char buf[255];
  int len;
};
is a conforming implementation. The specification is there only to define semantics, not to restrict implementations. So, there is not necessarily a pointer/reference to something else in the string at all. The string buffer is not necessarily immutable.

 
A string variable, in contrast, can freely change its value, e.g. by "somehow" having the values of different string objects at different times.

Jan Mercl

Oct 3, 2011, 7:53:25 AM10/3/11
to golan...@googlegroups.com
On Monday, October 3, 2011 1:23:09 PM UTC+2, Dmitry Vyukov wrote:
I do not output 's', I output '*s', and '*s' is exactly the same string object, and its value is changing over time.

*s is not a string object. *s is a mutable variable which can "hold" string objects (note: you can't take an address of a string constant, so *string is guaranteed to be pointing to a string variable/field and never to a string [memory] "object" per se). Those string objects are immutable. I'm not an authority, the specs are - I'm just citing them (loosely ;-)
 

We don't know. It perfectly can be mutated w/o violating the language spec.

struct string {
  char buf[255];
  int len;
};
is a conforming implementation. The specification is there only to define semantics, not to restrict implementations. So, there is not necessarily a pointer/reference to something else in the string at all. The string buffer is not necessarily immutable.

That's not correct. Conforming implementation must (in the above implementation case example) guarantee that the char field (of the above struct) is not mutable (after the struct is created/initialized). And that's the case of Go implementation (although the implementation details are different - [the immutable] semantics are the important thing here).

Dmitry Vyukov

Oct 3, 2011, 7:57:33 AM10/3/11
to golan...@googlegroups.com
On Mon, Oct 3, 2011 at 3:53 PM, Jan Mercl <jan....@nic.cz> wrote:
On Monday, October 3, 2011 1:23:09 PM UTC+2, Dmitry Vyukov wrote:
I do not output 's', I output '*s', and '*s' is exactly the same string object, and its value is changing over time.

*s is not a string object. *s is a mutable variable which can "hold" string objects (note: you can't take an address of a string constant, so *string is guaranteed to be pointing to a string variable/field and never to a string [memory] "object" per se). Those string objects are immutable. I'm not an authority, the specs are - I'm just citing them (loosely ;-)
 

We don't know. It perfectly can be mutated w/o violating the language spec.

struct string {
  char buf[255];
  int len;
};
is a conforming implementation. The specification is there only to define semantics, not to restrict implementations. So, there is not necessarily a pointer/reference to something else in the string at all. The string buffer is not necessarily immutable.

That's not correct. Conforming implementation must (in the above implementation case example) guarantee that the char field (of the above struct) is not mutable (after the struct is created/initialized).

No, it does not have to, because you won't be able to tell the difference.
 
And that's the case of Go implementation (although the implementation details are different - [the immutable] semantics are the important thing here).

Indeed. Semantics of my string implementation are the same. You can't tell the difference.

Jan Mercl

Oct 3, 2011, 8:11:06 AM10/3/11
to golan...@googlegroups.com
On Monday, October 3, 2011 1:57:33 PM UTC+2, Dmitry Vyukov wrote:
That's not correct. A conforming implementation must (in the implementation example above) guarantee that the char field (of the above struct) is not mutable (after the struct is created/initialized).

No, it does not have to, because you won't be able to tell the difference.
 
And that's the case for the Go implementation (although the implementation details are different - [the immutable] semantics are the important thing here).

Indeed. Semantics of my string implementation are the same. You can't tell the difference.

Well, I don't agree. As seen above, I terribly failed to successfully explain the semantics of immutability of strings in Go (or more correctly said - how I understand them). That's my fault - sorry. I hope someone from the Go authors can try better ;-)

chris dollin

Oct 3, 2011, 8:34:34 AM10/3/11
to golan...@googlegroups.com
I don't understand why there's all this dispute about the immutability
of strings, unless it's perhaps some slightly sloppy wording in the spec.

string /variables/ are mutable; you can freely change their value, as the
term "variable" hints.

string /values/ are immutable; you cannot update the components of
a string value, ie

"hello"[2] = 'p'
x := "chat"; x[3] = 't'

are both illegal, while a subsequent

x = "chad"

is legal.

/Because/ strings are immutable, implementations can share storage
for the component bytes comprising the spelling of the string, rather than
having to make a fresh (possibly large) copy:

y := "some enormous string a long way down to the chemists."
ylette := y[10:20]

The Go implementation has both y and ylette referring to "the same"
array of bytes. Since one cannot update the bytes, nothing done to y
can change the value of ylette.

Hence Dmitry's "you can't tell the difference" [1].

(Of course this means that a one-character slice of a multi-megabyte
string might end up holding onto the whole big thing, depending on
interesting details of the garbage collector.)
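A minimal sketch of the sharing and retention described above, assuming the gc toolchain and Go 1.20 or later (for unsafe.StringData; strings.Clone needs Go 1.18). The helper names sharesStorage and sliceThenClone are made up for illustration, and the pointer comparison is for inspection only, not something a program should depend on:

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// sharesStorage reports whether the bytes of sub lie inside the bytes of s.
// It peeks at the data pointers with unsafe.StringData, for inspection only.
func sharesStorage(s, sub string) bool {
	if len(s) == 0 || len(sub) == 0 {
		return false
	}
	start := uintptr(unsafe.Pointer(unsafe.StringData(s)))
	p := uintptr(unsafe.Pointer(unsafe.StringData(sub)))
	return p >= start && p+uintptr(len(sub)) <= start+uintptr(len(s))
}

// sliceThenClone builds a large string at run time, slices it, then copies
// the slice. It returns whether the slice and the copy share storage with
// the large string, and whether the copy still equals the slice.
func sliceThenClone() (sliceShares, cloneShares, equal bool) {
	y := strings.Repeat("chemist ", 1<<10)
	ylette := y[10:20]
	detached := strings.Clone(ylette) // fresh allocation holding only 10 bytes
	return sharesStorage(y, ylette), sharesStorage(y, detached), detached == ylette
}

func main() {
	fmt.Println(sliceThenClone())
}
```

Under these assumptions the slice should report shared storage and the clone should not, while the two remain equal as values. Copying a small slice this way is one option when a tiny substring would otherwise keep a very large string alive.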

I hope any sloppiness in my wording destructively interferes with
any in the spec.

Chris

[1] From inside safe code and assuming loadsa memory, no, more
than that.

--
Chris "for constructive effect" Dollin

roger peppe

Oct 3, 2011, 8:45:46 AM10/3/11
to golang-nuts
[oops, I forgot to Reply-All to this earlier, sending it to Dmitry alone]

On 3 October 2011 11:35, Dmitry Vyukov <dvy...@google.com> wrote:
> That's a very basic property that holds for basically all types - if you
> have a private object that nobody else has references to and you do not
> mutate it, then its value does not change. It's not immutability.

it doesn't hold for maps or slices, both of which i'd consider "mutable".

it depends whether we're talking about mutability of the *type*
or mutability of the *variable*. all variables in Go are mutable - strings
are no different - but only some types are mutable.

perhaps that distinction is the source of what seems like an argument at cross
purposes here.
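a small sketch of that distinction, using only the language itself; the function name aliasEffects is made up for illustration. a copy of a slice or map sees changes made through the original, while a copy of a string does not, because the only thing that can change is the variable:

```go
package main

import "fmt"

// aliasEffects copies a slice, a map and a string into second variables,
// changes each original, and returns what the copies hold afterwards,
// followed by the final value of the string variable.
func aliasEffects() (sliceCopy string, mapCopy int, strCopy, strNow string) {
	b := []byte("chat")
	bCopy := b // same backing array as b
	b[3] = 'd' // mutates the shared array in place

	m := map[string]int{"k": 1}
	mCopy := m // same underlying map as m
	m["k"] = 2

	s := "chat"
	sCopy := s
	s = "chad" // rebinds the variable s; the value "chat" is untouched

	return string(bCopy), mCopy["k"], sCopy, s
}

func main() {
	fmt.Println(aliasEffects()) // prints "chad 2 chat chad"
}
```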

Oct 3, 2011, 12:14:19 PM10/3/11
to golang-nuts
Well, this discussion seems to be getting out of hand. Maybe a
Platonic argument will be useful here.

I would suggest we consider the following question: What is the
address of the integer number 123 in Go?

The answer is: 123 does not have an address which would allow you to
modify the number 123. You cannot modify 123. You cannot take the
address of the number 123 and change 123 into something else. You are
only allowed to take the address of a memory location currently
holding 123 and put a different number in that memory location. If you
do not have address of X which would allow you to modify X, you cannot
modify X.

The same logic extends to Go strings: What is the address that you can
use to *modify* the string "abc" in Go? The answer is: there is no
such address. Thus, since there is no address which would grant you
modification rights, all Go strings are immutable.

I will repeat the argument: Since there is no address through which to
modify the string "abc" in a Go program, the string is BY DEFINITION
immutable. What is mutable then? Variables are mutable. Any variable
that you can modify in a Go program has to have a name or an address
through which you can change the variable.
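The argument so far can be sketched in a few lines of Go. The names rebind and variableVsValue are made up for illustration; the two commented-out lines are there to show what the compiler rejects:

```go
package main

import "fmt"

// rebind stores v into the string variable that p points to.
func rebind(p *string, v string) {
	*p = v
}

// variableVsValue returns a copy taken before the rebind and the variable's
// value after it.
func variableVsValue() (saved, current string) {
	s := "abc"
	saved = s         // copies the value held by s
	rebind(&s, "xyz") // legal: &s is the address of a variable
	// rebind(&"abc", "xyz") // does not compile: a string value has no address
	// s[0] = 'x'            // does not compile: string bytes cannot be assigned
	return saved, s
}

func main() {
	fmt.Println(variableVsValue()) // prints "abc xyz"
}
```

The copy taken before the rebind still holds "abc": the variable changed, the value did not.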

The mathematical object 123 does not have any address which would
allow you to change it. Just think about it: if you knew how to change
123 into something different, it would mean that after the change all
computations involving 123 would start yielding different results than
what we are used to. I believe such a change is impossible.

The fact that under the current implementation (8g/6g/5g) certain
read-write patterns in concurrent programs can lead to unintended states
and crash the Go program has nothing to do with strings. It has to do
with variables.
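One way to keep a shared string variable consistent is to guard it with a lock. The following is only a sketch: the type guarded and the function hammer are hypothetical names, and sync.RWMutex is just one of several possible choices. It assumes, as described elsewhere in this thread, that a string variable holds more than one machine word, so an unsynchronized concurrent read and write is a data race:

```go
package main

import (
	"fmt"
	"sync"
)

// guarded is a hypothetical wrapper that serialises access to one string
// variable.
type guarded struct {
	mu sync.RWMutex
	s  string
}

// Set replaces the value held by the variable.
func (g *guarded) Set(v string) {
	g.mu.Lock()
	g.s = v
	g.mu.Unlock()
}

// Get returns the value currently held by the variable.
func (g *guarded) Get() string {
	g.mu.RLock()
	defer g.mu.RUnlock()
	return g.s
}

// hammer lets one goroutine flip the variable between two values of different
// lengths while the caller reads it, and counts reads that match neither.
func hammer(rounds int) (bad int) {
	const short, long = "a", "a much longer string value"
	var g guarded
	g.Set(short)

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 0; i < rounds; i++ {
			if i%2 == 0 {
				g.Set(long)
			} else {
				g.Set(short)
			}
		}
	}()

	for i := 0; i < rounds; i++ {
		if v := g.Get(); v != short && v != long {
			bad++
		}
	}
	wg.Wait()
	return bad
}

func main() {
	fmt.Println("inconsistent reads:", hammer(100000))
}
```

With the lock in place every read should see one of the two complete values. The string values themselves never need protection; only the variable does.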

What you seem to be suggesting throughout this discussion is that you
happen to know a method which would allow you to modify Go strings.
You have multiple times written that Go strings are mutable. Ok, maybe
they are mutable and you are right about it - but if Go strings are
mutable it implies that you also happen to know a method of how to
modify the number 123 itself.

I really would like to know how to change 123 into something else.


Jonathan Amsterdam

Oct 3, 2011, 1:45:01 PM10/3/11
to golang-nuts
Dmitry is simply pointing out that "type T has value semantics" and
"type T has reference semantics and is immutable" are semantically
indistinguishable. He's certainly right about that, but he's making
his point in a terribly confusing way.
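A rough sketch of that equivalence in Go. The type inlineString is a hypothetical value-type string modelled on the struct with a 255-byte buffer quoted earlier in the thread; it is not how any Go implementation represents strings. The same assign-then-rebind sequence is run on both representations:

```go
package main

import "fmt"

// inlineString is a hypothetical value-type string: the bytes live inside
// the value itself rather than behind a pointer.
type inlineString struct {
	buf [255]byte
	n   int
}

// newInline copies up to 255 bytes of s into a fresh value.
func newInline(s string) inlineString {
	var v inlineString
	v.n = copy(v.buf[:], s)
	return v
}

// String returns the contents as an ordinary Go string.
func (v inlineString) String() string {
	return string(v.buf[:v.n])
}

// observe runs the same sequence on a built-in string and on an inlineString
// and returns, for each, the copy taken early and the variable's final value.
func observe() (native, inline [2]string) {
	s := "foo"
	sCopy := s
	s = "bar"

	t := newInline("foo")
	tCopy := t // copies the whole 255-byte buffer and the length
	t = newInline("bar")

	return [2]string{sCopy, s}, [2]string{tCopy.String(), t.String()}
}

func main() {
	native, inline := observe()
	fmt.Println(native, inline, native == inline) // prints "[foo bar] [foo bar] true"
}
```

From safe code the two behave the same; what differs is the cost of each assignment.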

And programmers care about performance as well as semantics, so they
would really like to know whether T is implemented as a value (i.e.
directly, as in Dmitry's alternative implementation of string) or by a
reference to immutable data. C++ strings are values and Java strings
are immutable references, and this matters to the performance of a
call to void f(string s). In C++ that declaration is practically a
bug, sufficient to prevent me from hiring its author; in Java, it's
fine.

The spec, in saying strings are immutable, is speaking to the
practical programmer, not the semanticist. It is in effect saying,
"strings have value semantics, but they're implemented as references
to immutable data, so it's cheap to assign, pass and return them."
Here as elsewhere, the spec relies on the good sense and charity of
the reader. It is not a formal document, and does not aspire to be one.
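A small sketch of the "cheap to pass" point, assuming the gc toolchain and Go 1.20 or later (for unsafe.StringData). The function names are made up for illustration, and the two-word figure is a property of that implementation, not something the spec promises:

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// headerWords returns the size of a string variable in machine words.
// The result does not depend on how many bytes the string contains.
func headerWords(s string) uintptr {
	return unsafe.Sizeof(s) / unsafe.Sizeof(uintptr(0))
}

// dataAfterCall receives s by value and returns the data pointer it sees.
func dataAfterCall(s string) *byte {
	return unsafe.StringData(s)
}

// passedWithoutCopy reports whether the callee saw the same bytes as the
// caller, i.e. whether passing the string copied only the header.
func passedWithoutCopy(s string) bool {
	return dataAfterCall(s) == unsafe.StringData(s)
}

func main() {
	small := "x"
	large := strings.Repeat("x", 1<<20)
	fmt.Println(headerWords(small), headerWords(large))
	fmt.Println(passedWithoutCopy(small), passedWithoutCopy(large))
}
```

Both strings should report the same header size, and the callee should see the caller's bytes rather than a copy.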

Oct 3, 2011, 2:38:47 PM10/3/11
to golang-nuts
On Oct 3, 7:45 pm, Jonathan Amsterdam <j...@google.com> wrote:
> C++ strings are values and Java strings
> are immutable references, and this matters to the performance of a
> call to void f(string s). In C++ that declaration is practically a
> bug, sufficient to prevent me from hiring its author;

I think it isn't sufficient.

Robert Bloomquist

Oct 3, 2011, 6:52:12 PM10/3/11
to golan...@googlegroups.com
package main

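func main() {
	s := "foo"
	println(&s, s) // address of the variable s, then its current value
	st := s[:]     // st refers to the same bytes as the original "foo"
	s = "bar"      // rebinds the variable s; the bytes of "foo" are untouched
	println(st, &s, s)
}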

Output is:
0x2084f98 foo
foo 0x2084f98 bar

Hopefully this program clears up some confusion.

String values contain an integer length and a pointer to an underlying data array.  When you assign a new value to a string variable, the pointer and length held in the variable are overwritten to refer to the new value's data. The original array is not modified, and other strings that refer to it are unchanged.

The address of the string variable is not the same as the address of the underlying array. That's why in Dmitry's example, the address did not change, even though the value did.
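A sketch that makes the two addresses visible separately, assuming the gc toolchain and Go 1.20 or later (for unsafe.StringData). The function name reassign is made up for illustration:

```go
package main

import (
	"fmt"
	"unsafe"
)

// reassign records the address of the variable s and the address of its
// bytes, assigns a new value, and reports which of the two stayed the same.
func reassign() (sameVariable, sameData bool) {
	s := "foo"
	varAddr := &s
	dataAddr := unsafe.StringData(s)

	s = "bar"

	return varAddr == &s, dataAddr == unsafe.StringData(s)
}

func main() {
	fmt.Println(reassign())
}
```

The variable should keep its address while the data pointer inside it moves to the bytes of the new value.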

Christoffer Hallas

Oct 3, 2011, 8:15:35 PM10/3/11
to golan...@googlegroups.com
I guess the runtime representations of both Slice and String are quite good indications of how it looks, at least on 6g and 8g.
