I know that maps are not thread safe. But how unsafe are they?
Can a map get into an illegal state where it can't be used?
Even worse, can the memory allocation get into illegal state?
Can a pointer be partly updated, leading to illegal pointers?
This is also a general question about much of the language functionality and the runtime library. For a runtime library, I expect that a package can be broken (get into an illegal state) by parallel threads, unless explicitly stated otherwise. The reason I am asking is that you can sometimes skip the use of semaphores for an object if the algorithm can handle random asynchronous changes of that object.
But that would only be possible if the basic programming language mechanisms at least keep internal "state safety". There is the sync/atomic package, which seems to indicate that almost anything can indeed be broken from parallel threads.
On Oct 1, 12:54 pm, Dmitry Vyukov <dvyu...@google.com> wrote:
> On Sat, Oct 1, 2011 at 2:47 PM, Lars Pensjö <lars.pen...@gmail.com> wrote:
> > This is also a general question on many of the language functionality and
> > runtime library. For a runtime library, I expect that a package can be
> > broken (getting an illegal state) from parallel threads, unless explicitly
> > stated otherwise. The reason I am asking is that you can sometimes skip the
> > use of semaphores for an object if the algorithm can handle random
> > asynchronous changes of that object.
>
> Such objects called thread-safe, map/slice/string are not.
Go strings are fully thread-safe (they are immutable).
Slices are thread-safe as long as they refer to non-overlapping memory
regions, even when multiple slices are backed by the same array.
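A minimal sketch of the non-overlapping case, assuming only element writes are concurrent (the slice headers themselves are never reassigned while the goroutines run). The function name fillHalves is made up for illustration:

```go
package main

import (
	"fmt"
	"sync"
)

// fillHalves (hypothetical helper) writes to the two halves of one backing
// array from two goroutines. The halves do not overlap, so no element is
// touched by both goroutines.
func fillHalves(n int) []int {
	backing := make([]int, n)
	left, right := backing[:n/2], backing[n/2:]

	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		for i := range left {
			left[i] = 1
		}
	}()
	go func() {
		defer wg.Done()
		for i := range right {
			right[i] = 2
		}
	}()
	wg.Wait() // both writers are done before the caller reads
	return backing
}

func main() {
	fmt.Println(fillHalves(6)) // [1 1 1 2 2 2]
}
```

Note that this says nothing about assigning to a slice variable from two goroutines; that is a separate question.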
Map is the most problematic case. They are (most likely, somebody
please confirm this) thread-safe as long as you are not adding
anything to the map: concurrently overwriting already existing values
is probably safe as long as the keys are different (this is similar to
a slice). So this should work:
m := make(map[string]int)
m["a"] = 1
m["b"] = 2
go func(){ m["a"] = 3 }()
go func(){ m["b"] = 4 }()
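Since nobody has confirmed that unsynchronized map writes are safe, here is a sketch of the conservative alternative: the same two updates, but with every map access guarded by a sync.Mutex. SafeMap and its methods are names invented for this example, not anything from the standard library:

```go
package main

import (
	"fmt"
	"sync"
)

// SafeMap is a hypothetical wrapper that serializes all access to the map.
type SafeMap struct {
	mu sync.Mutex
	m  map[string]int
}

func NewSafeMap() *SafeMap { return &SafeMap{m: make(map[string]int)} }

// Set stores v under k while holding the lock.
func (s *SafeMap) Set(k string, v int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[k] = v
}

// Get reads the value under k while holding the lock.
func (s *SafeMap) Get(k string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.m[k]
}

// update repeats the example above, but through the locked wrapper.
func update() (int, int) {
	sm := NewSafeMap()
	sm.Set("a", 1)
	sm.Set("b", 2)

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); sm.Set("a", 3) }()
	go func() { defer wg.Done(); sm.Set("b", 4) }()
	wg.Wait()
	return sm.Get("a"), sm.Get("b")
}

func main() {
	fmt.Println(update()) // 3 4
}
```

The lock costs something, but it does not depend on how the current map implementation happens to behave.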
On Sat, Oct 1, 2011 at 2:47 PM, Lars Pensjö <lars....@gmail.com> wrote:
[Can parallel threads break data?] yes [...] yes [...] yes [...] yes
huh? The memory model says that anything larger than a machine word
can't be atomically written or read. So writing to a byte can be atomic.
Writing to a string isn't, since a string is more than a single machine
word (length+pointer).
>>> Slices are thread-safe as long as they refer to non-overlapping memory
>>> regions, even when multiple slices are backed by the same array.
>>
>> The same as strings.
>
> The same as bytes?
Slices are the same as strings: length+pointer is too big to be atomically
written.
>> Just don't run hg update ;)
>
> I run "hg update", but I didn't notice anything map-related.
Some things in the current implementation work but aren't guaranteed to
work. As long as you never update to a newer version you'll never have
to deal with these kinds of changes.
- jessta
Thanks Dmitry for the information!
On Saturday, 1 October 2011 12:54:13 UTC+2, Dmitry Vyukov wrote:
> On Sat, Oct 1, 2011 at 2:47 PM, Lars Pensjö <lars....@gmail.com> wrote:
> > [Can parallel threads break data?]
> yes [...] yes [...] yes [...] yes
Hm, I think I get the point...
Using the sync/atomic package, you can do some kind of "atomic" read and write. But what are the rules? That is, if you do atomic.StoreInt32(), are you then required to do atomic.LoadInt32() to be guaranteed consistency?
Are the atomic operations implemented using semaphores (implying possible performance issues), or using CPU instructions that guarantee the atomic behavior? Maybe you are not supposed to rely on either, as it may be implementation-specific.
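For what it's worth, a small sketch of how the sync/atomic calls are normally paired. As far as I understand the memory model, every access to a variable that is written atomically should also go through sync/atomic; mixing in a plain read would be a data race. The function name count is invented for the example:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// count (hypothetical helper) increments one shared int32 from several
// goroutines using atomic.AddInt32 and reads it back with atomic.LoadInt32.
func count(workers, perWorker int) int32 {
	var n int32
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				atomic.AddInt32(&n, 1) // atomic read-modify-write
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt32(&n) // atomic read, matching the atomic writes
}

func main() {
	fmt.Println(count(4, 1000)) // 4000
}
```

With a plain n++ in the loop the result would not be guaranteed to be 4000.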
> > Slices are thread-safe as long as they refer to non-overlapping memory
> > regions, even when multiple slices are backed by the same array.
>
> The same as strings.
The same as bytes?
> > Map is the most problematic case. They are (most likely, somebody
> > please confirm this) thread-safe as long as you are not adding
> > anything to the map: concurrently overwriting already existing values
> > is probably safe as long as the keys are different (this is similar to
> > a slice). So this should work:
>
> > m := make(map[string]int)
> > m["a"] = 1
> > m["b"] = 2
> > go func(){ m["a"] = 3 }()
> > go func(){ m["b"] = 4 }()
>
> Just don't run hg update ;)
I run "hg update", but I didn't notice anything map-related.
On Oct 1, 2:24 pm, Lars Pensjö <lars.pen...@gmail.com> wrote:
> But channels, they are thread-safe, aren't they?
All operations on channels are, obviously, thread-safe.
Assignments are not thread-safe.
Assigning a channel to another variable is *not* an operation on the
channel. It is an operation on the variable. Thus, I disagree with
Dmitry's interpretation.
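To make the distinction concrete, a sketch in which the only things shared between goroutines are channel operations (send, receive, close). The channel variable itself is set once, before the goroutine starts, and never reassigned. The name relay is made up for the example:

```go
package main

import "fmt"

// relay (hypothetical helper) passes strings from one goroutine to another
// through an unbuffered channel. The variable ch is never reassigned after
// the goroutine is started; only channel operations are concurrent.
func relay(names []string) []string {
	ch := make(chan string)
	go func() {
		defer close(ch)
		for _, n := range names {
			ch <- n // send: an operation on the channel
		}
	}()

	var got []string
	for n := range ch { // receive until the channel is closed
		got = append(got, n)
	}
	return got
}

func main() {
	fmt.Println(relay([]string{"foo", "bar"})) // [foo bar]
}
```

Doing `ch = otherChannel` in one goroutine while another reads ch would be an unsynchronized write to a variable, which is the case under dispute here.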
On Oct 1, 3:24 pm, Dmitry Vyukov <dvyu...@google.com> wrote:
> > Where do you see the memory model stating that it guarantees that it
> > will print either 0 or 10 in this case?
>
> "a read r may observe the value written by a write w that happens
> concurrently with r"
> It suggests that the read can observe the write and that both are atomic,
> because it (and the whole doc) talks only about "observing writes", not
> about "observing partial writes/undefined behavior/etc".
> That's what I as a user see there.
I do not see it there. The sentence "may observe A or B" is different
from "will observe either A or B". For all we know, "may observe A or
B" includes the possibility that you will observe C.
On Oct 1, 3:53 pm, Dmitry Vyukov <dvyu...@google.com> wrote:
> On Sat, Oct 1, 2011 at 5:49 PM, ⚛ <0xe2.0x9a.0...@gmail.com> wrote:
> > On Oct 1, 3:24 pm, Dmitry Vyukov <dvyu...@google.com> wrote:
> > > > Where do you see the memory model stating that it guarantees that it
> > > > will print either 0 or 10 in this case?
>
> > > "a read r may observe the value written by a write w that happens
> > > concurrently with r"
> > > It suggests that the read can observe the write and that both are atomic,
> > > because it (and the whole doc) talks only about "observing writes", not
> > > about "observing partial writes/undefined behavior/etc".
> > > That's what I as a user see there.
>
> > I do not see it there. The sentence "may observe A or B" is different
> > from "will observe either A or B". For all we know, "may observe A or
> > B" includes the possibility that you will observe C.
>
> May observe A or B, and since there is no other C that it may observe, it
> will observe either A or B.
Observing C does not contradict the sentence "may observe A or B". The
word "may" means that it is possible to observe A and it is possible
to observe B.
I'm pretty sure this has been discussed. GAE doesn't allow more than one thread. Thus no multithreading bugs. You can have many goroutines, but only one will ever be active.
I think the "never crash" version was basically to add a single pointer indirection to any type that is larger than one word. So interface ends up as a pointer to the real 2 word interface, etc. Which adds indirection overhead, but probably better than "only one thread allowed".
Not sure about map, though.
John
=:->
On Oct 1, 6:06 pm, Andrew Hart <hartandr...@gmail.com> wrote:
> > > > type S struct { a byte }
> > > > var s S
> > > > go func() { s.a = 10 }()
> > > > go func() { println(s.a) }()
> > Where do you see the memory model stating that it guarantees that it
> > will print either 0 or 10 in this case?
>
> There is a similar example in the memory model with the comment "If
> the channel were buffered (e.g., c = make(chan int, 1)) then the
> program would not be guaranteed to print "hello, world". (It might
> print the empty string; it cannot print "goodbye, universe", nor can
> it crash.)"
It can crash.
This is an interesting niche case. It shows that current Go
implementation is unsafe. (I presume the app engine is running Go
programs in a sandbox that prevents malicious actions.)
The question is how to create an implementation of the Go language
which is guaranteed not to crash. Considering how strings and slices
are defined in the specification, is such an implementation even
possible (without seriously affecting performance)?
On Sat, Oct 1, 2011 at 3:38 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
> Go strings are fully thread-safe (they are immutable).
Of course they are not. Neither thread-safe nor immutable.
Never ever do this:
type Person struct {
	name string
	...
}
go func() { person.name = newName }()
go func() { fmt.Printf("%s", person.name) }()
It breaks badly.
The problem is that you can end up with person.name containing a pointer
to an immutable array(for the string contents) and a length from the
previous string, if the scheduler happened to switch threads part way
through the update.
This could result in the printing of the string reading off the end of
the array.
A thread-safe data structure would give you some working value, either
the old or the new not a broken partial of both.
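One way to get the "either the old or the new" behavior, sketched with atomic.Value from sync/atomic. The Person type here is a hypothetical variant of the one above, with its name stored inside an atomic.Value rather than as a plain string field; a mutex around the field would work as well:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Person is a hypothetical type for illustration. The name is kept in an
// atomic.Value so that a reader gets a complete old or new string.
type Person struct {
	name atomic.Value // always holds a string
}

func (p *Person) SetName(n string) { p.name.Store(n) }
func (p *Person) Name() string     { return p.name.Load().(string) }

// renameWhileReading races one writer against one reader and reports what
// the reader saw plus the final value.
func renameWhileReading() (seen, final string) {
	p := &Person{}
	p.SetName("foo")

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); p.SetName("bar") }()
	go func() { defer wg.Done(); seen = p.Name() }()
	wg.Wait()
	return seen, p.Name()
}

func main() {
	seen, final := renameWhileReading()
	// seen is "foo" or "bar" depending on scheduling; final is "bar".
	fmt.Println(seen, final)
}
```

Which of the two values the reader sees is still up to the scheduler; the point is only that it cannot see a mix of both.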
It seems to me that this:
go func() { person.name = newName }()
go func() { fmt.Printf("%s", person.name) }()
is very different from this:
go func() { person.name.Mutate(...) }()
go func() { fmt.Printf("%s", person.name.Read()) }()
- In the first version, you are assigning a variable and passing that variable to a function. This is not thread safe since strings are bigger than a word.
- In the second version, you have a variable which presumably already contains an object, and you make two different calls on that object. This is thread safe since strings do not change their state, so regardless of the order of events, both goroutines see the same string.
The way I understand it, calling immutable objects inherently thread-safe follows from the fact that once you have created an instance of the object, you can give that instance to as many concurrently running threads as you want, and they will all see the object in a valid state, because it has only one state, the one it was created with.
I don't have that much experience with Go, but if strings are mutable, then I can see how they are not thread safe. I'm not sure how your example function can possibly result in that output though. I must be missing something.
The following function:
func foo(s *string) {
println(*s)
bar()
println(*s)
}
can output:
foo
bar
Which makes it obvious that strings are not immutable and do change their state.
On Sunday, October 2, 2011 8:23:49 PM UTC+2, Dmitry Vyukov wrote:
> The following function:
> func foo(s *string) {
> println(*s)
> bar()
> println(*s)
> }
> can output:
> foo
> bar
> Which makes it obvious that strings are not immutable and do change their state.
No. What mutates in foo is a pointer to a string.
The thing s pointed to on entry to foo is not changed by mutating s, nor by any way of dereferencing it. Strings in Go are immutable. Only the concrete implementation of Go (6g and maybe gccgo too) is racy.
It is exactly THE SAME string object.
Go strings are perfectly mutable.
Strings behave like arrays of bytes but are immutable: once created, it is impossible to change the contents of a string.
Except with strings, the spec isn't clear that strings are reference
types. It says strings are like arrays. Arrays are value types. Slices
are reference types.
The go spec seems misleading when defining strings. Strings look like
slices of unaddressable byte arrays, not like immutable byte arrays.
As Dmitry has pointed out, strings don't look immutable.
No, it's not. The spec says that the contents of a
string are immutable. A variable can be assigned a
new value but the value cannot be modified.
$ cat >a.go
package main
import "unsafe"
func show(s *string) {
type String struct {
ptr *byte
len int32
}
sp := (*String)(unsafe.Pointer(s))
println(s, "{", sp.ptr, sp.len, *s, "}")
}
func main() {
var s string
s = "foo"
show(&s)
s = "bar"
show(&s)
}
$ 8g a.go && 8l a.8
$ ./8.out
0xb7894fcc { 0x80593b8 3 foo }
0xb7894fcc { 0x8059394 3 bar }
$
The value residing at address 0xb7894fcc in memory
can vary but the values themselves cannot.
I don't understand why there's so much discussion
about this.
1. String values are immutable.
2. String variables are not thread-safe.
It's simple: the memory model does not guarantee
that accessing a variable will be atomic. If this
is important in your application you must use some
kind of synchronization.
Cheers,
Anthony
The go spec seems misleading when defining strings. Strings look like
slices of unaddressable byte arrays, not like immutable byte arrays.
I'll quibble with this. You can't slice an unaddressable array. I'd say they look like slices of immutable byte arrays. The elements of the slice aren't addressable solely because Go's type system cannot represent the type of such an expression (pointer to an immutable byte), ...
...though the ability to take the address of the elements would imply reference semantics (so that reassigning with a shorter string couldn't make a pointer point to a non-existent element).
On Mon, Oct 3, 2011 at 12:25 AM, Jan Mercl <jan....@nic.cz> wrote:
> On Sunday, October 2, 2011 8:23:49 PM UTC+2, Dmitry Vyukov wrote:
> > The following function:
> > func foo(s *string) {
> > println(*s)
> > bar()
> > println(*s)
> > }
> > can output:
> > foo
> > bar
> > Which makes it obvious that strings are not immutable and do change their state.
> No. What mutates in foo is a pointer to a string.
Are you joking?
func foo(s *string) {
println(s, *s)
bar()
println(s, *s)
}
0x28f50 foo
0x28f50 bar
It is exactly THE SAME string object.
> The thing s pointed to on entry to foo is not changed by mutating s, nor by any way of dereferencing it. Strings in Go are immutable. Only the concrete implementation of Go (6g and maybe gccgo too) is racy.
How then would you explain that exactly the same string object has different values?
Dmitry Vyukov <dvy...@google.com> once said:
> Are you joking?
>
> func foo(s *string) {
> println(s, *s)
> bar()
> println(s, *s)
> }
>
> 0x28f50 foo
> 0x28f50 bar
>
> It is exactly THE SAME string object.
No, it's not. The spec says that the contents of a
string are immutable. A variable can be assigned a
new value but the value cannot be modified.
$ cat >a.go
package main
import "unsafe"
func show(s *string) {
type String struct {
ptr *byte
len int32
}
sp := (*String)(unsafe.Pointer(s))
println(s, "{", sp.ptr, sp.len, *s, "}")
}
func main() {
var s string
s = "foo"
show(&s)
s = "bar"
show(&s)
}
$ 8g a.go && 8l a.8
$ ./8.out
0xb7894fcc { 0x80593b8 3 foo }
0xb7894fcc { 0x8059394 3 bar }
$
The value residing at address 0xb7894fcc in memory
can vary but the values themselves cannot.
Dmitry, you are arguing a triviality where both answers are right. It's just a matter of perspective.
> It is exactly THE SAME string object.
Depends. It's exactly the same variable, the same memory location. However, the idea of what the "string object" is will often refer to the value, not the memory location. In which case, you've merely replaced the object with another one, rather than mutating a string object.
On Sun, Oct 2, 2011 at 3:13 PM, Dmitry Vyukov <dvy...@google.com> wrote:
> Go strings are perfectly mutable.
From the spec:
> Strings behave like arrays of bytes but are immutable: once created, it is impossible to change the contents of a string.
Again, you're clearly arguing with a different perspective on what mutation is than the one contained in the spec. I'll venture that the distinction comes from the difference between value types and reference types.
With reference types, reassignment is clearly different from mutation. Mutating the object affects all variables or expressions that refer to that object, while reassigning the variable does not. With value types, though, the distinction doesn't really exist. Reassignment is mutation, as anything referring to the original elements will be affected by a reassignment.
The problem with strings is that they fall in that fuzzy zone. A mutable value and a mutable reference to an immutable value are semantically equivalent. So, they can be considered from either perspective. Obviously, the spec is considering them as mutable references to immutable values, since that's how it describes them (an immutable byte slice). I'd say I'd prefer to use this description for that reason. You're arguing for talking about them as values, which is fine, but it's not very useful to argue about it.
This thread has made me think about slices vs arrays and reference
types vs value types. Thanks :-)
On Oct 2, 8:29 pm, Steven Blenkinsop <steven...@gmail.com> wrote:
> The problem with strings is that they fall in that fuzzy zone. A mutable
> value and a mutable reference to an immutable value are semantically
> equivalent. So, they can be considered from either perspective. Obviously,
> the spec is considering them as mutable references to immutable values,
> since that's how it describes them (an immutable byte slice). I'd say I'd
> prefer to use this description for that reason. You're arguing for talking
> about them as values, which is fine, but it's not very useful to argue about
> it.
Except with strings, the spec isn't clear that strings are reference
types. It says strings are like arrays. Arrays are value types. Slices
are reference types.
The go spec seems misleading when defining strings. Strings look like
slices of unaddressable byte arrays, not like immutable byte arrays.
[1]
only in certain cases (that is, when a reference has been taken).
if i have this function:
func f(s string) {
print(s)
waitForSomething()
print(s)
}
i know that it *must* print the same thing twice.
that's what i'd understand by immutable - it's the
same semantics as an int, for example.
(atomicity of assignment guarantees aside, of course)
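A small sketch contrasting the two cases argued over in this thread: a function that receives a string by value, and one that receives a pointer to a string variable. The names snapshot and viaPointer are invented for the example, and everything runs in one goroutine so atomicity is not at issue:

```go
package main

import "fmt"

// snapshot receives the string by value. Whatever mutate does to the
// caller's variable, the parameter s keeps the value it was called with.
func snapshot(s string, mutate func()) (before, after string) {
	before = s
	mutate()
	return before, s
}

// viaPointer receives a pointer to the caller's variable, so it observes
// the variable being reassigned.
func viaPointer(p *string, mutate func()) (before, after string) {
	before = *p
	mutate()
	return before, *p
}

func main() {
	v := "foo"
	a, b := snapshot(v, func() { v = "bar" })
	fmt.Println(a, b, v) // foo foo bar

	w := "foo"
	c, d := viaPointer(&w, func() { w = "bar" })
	fmt.Println(c, d) // foo bar
}
```

In the second case it is the variable w that changed; the value "foo" that was read first is still "foo".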
On Sunday, October 2, 2011 10:38:47 PM UTC+2, Dmitry Vyukov wrote:
> On Mon, Oct 3, 2011 at 12:25 AM, Jan Mercl <jan....@nic.cz> wrote:
> > On Sunday, October 2, 2011 8:23:49 PM UTC+2, Dmitry Vyukov wrote:
> > > The following function:
> > > func foo(s *string) {
> > > println(*s)
> > > bar()
> > > println(*s)
> > > }
> > > can output:
> > > foo
> > > bar
> > > Which makes it obvious that strings are not immutable and do change their state.
> > No. What mutates in foo is a pointer to a string.
> Are you joking?
No, I'm not.
> func foo(s *string) {
> println(s, *s)
> bar()
> println(s, *s)
> }
> 0x28f50 foo
> 0x28f50 bar
> It is exactly THE SAME string object.
A pointer to an entity is neither identical nor equal to the entity it points to. Here 's' is not a string object.
If 's' were not a pointer to a string but a string variable/parameter, then the thing stored in the variable could change. The underlying original string cannot (see below).
I do not output 's', I output '*s', and '*s' is string object, and its value is changing over time.
I as a user don't know nor care what is "underlying original string".
On Monday, October 3, 2011 12:30:53 PM UTC+2, Dmitry Vyukov wrote:
> I do not output 's', I output '*s', and '*s' is string object, and its value is changing over time.
This is a string object:
"foo"
This is a string variable with an initializer, so it holds a string object (in an implementation specific way):
var s = "foo"
This is a pointer to a string variable:
var p = &s
Those three things are very/really/completely different. You are observing changes made to a variable (through dereferencing a pointer, though that's not important here). Variables are mutable (that's their essence). String objects are not mutable. Per specs, per previously shown examples. Let's ignore the implementation races which can break the guarantee.
> I as a user don't know nor care what is "underlying original string".
Actually in a lot of places programs do care exactly about that. That's what string immutability, in any programming language supporting it, is for.
On Mon, Oct 3, 2011 at 2:42 PM, Jan Mercl <jan....@nic.cz> wrote:
> On Monday, October 3, 2011 12:30:53 PM UTC+2, Dmitry Vyukov wrote:
> > I do not output 's', I output '*s', and '*s' is string object, and its value is changing over time.
> This is a string object:
> "foo"
> This is a string variable with an initializer, so it holds a string object (in an implementation specific way):
> var s = "foo"
> This is a pointer to a string variable:
> var p = &s
The predeclared string type is string.
> Those three things are very/really/completely different. You are observing changes made to a variable (through dereferencing a pointer, though that's not important here). Variables are mutable (that's their essence). String objects are not mutable. Per specs, per previously shown examples. Let's ignore the implementation races which can break the guarantee.
> > I as a user don't know nor care what is "underlying original string".
> Actually in a lot of places programs do care exactly about that. That's what string immutability, in any programming language supporting it, is for.
Maybe yes, maybe not. It's irrelevant.
Just in case: I understand that a reasonable Go implementation implements strings as a pointer to an immutable buffer. But that does not make them immutable.
That would mean that strings must be initialized to their final (immutable) value during construction. That is not the case in Go,
where you frequently create an empty string and then *mutate* it (possibly several times) to contain a desired value.
On Monday, October 3, 2011 12:58:33 PM UTC+2, Dmitry Vyukov wrote:
> On Mon, Oct 3, 2011 at 2:42 PM, Jan Mercl <jan....@nic.cz> wrote:
> > On Monday, October 3, 2011 12:30:53 PM UTC+2, Dmitry Vyukov wrote:
> > > I do not output 's', I output '*s', and '*s' is string object, and its value is changing over time.
> > This is a string object:
> > "foo"
> > This is a string variable with an initializer, so it holds a string object (in an implementation specific way):
> > var s = "foo"
> > This is a pointer to a string variable:
> > var p = &s
> The predeclared string type is string.
Types are yet another topic, but let me follow: then a pointer to string (which you call, for reasons not known to me, a "string object") is not the same type as string, and so the string immutability guarantee has nothing to do with it, right?
> > Those three things are very/really/completely different. You are observing changes made to a variable (through dereferencing a pointer, though that's not important here). Variables are mutable (that's their essence). String objects are not mutable. Per specs, per previously shown examples. Let's ignore the implementation races which can break the guarantee.
> > > I as a user don't know nor care what is "underlying original string".
> > Actually in a lot of places programs do care exactly about that. That's what string immutability, in any programming language supporting it, is for.
> Maybe yes, maybe not. It's irrelevant.
> Just in case: I understand that a reasonable Go implementation implements strings as a pointer to an immutable buffer. But that does not make them immutable.
Go strings (objects, not to be confused with string variables) are immutable. Another way to think about the problem is for example: regardless of the Go implementation, the string value/object is just "somewhere" in the computer's memory. That very memory cannot be, once constructed, written to by a Go program (sans "unsafe") as long as that value is reachable.
A string variable, in contrast, can freely change its value, e.g. by "somehow" having the values of different string objects at different times.
I do not output 's', I output '*s', and '*s' is exactly the same string object, and its value is changing over time.
We don't know. It perfectly can be mutated w/o violating the language spec.
struct string {
char buf[255];
int len;
};
is a conforming implementation. The specification is there only to define semantics, not to restrict implementations. So there is not necessarily a pointer/reference to something else in the string at all. The string buffer is not necessarily immutable.
On Monday, October 3, 2011 1:23:09 PM UTC+2, Dmitry Vyukov wrote:
> I do not output 's', I output '*s', and '*s' is exactly the same string object, and its value is changing over time.
*s is not a string object. *s is a mutable variable which can "hold" string objects (note: you can't take an address of a string constant, so *string is guaranteed to be pointing to a string variable/field and never to a string [memory] "object" per se). Those string objects are immutable. I'm not an authority, the specs are - I'm just citing them (loosely ;-)
> We don't know. It perfectly can be mutated w/o violating the language spec.
> struct string {
> char buf[255];
> int len;
> };
> is a conforming implementation. Specification is there only to define semantics, not to restrict implementations. So, there is not necessary a pointer/reference to something else in the string al all. String buffer is not necessary immutable.
That's not correct. Conforming implementation must (in the above implementation case example) guarantee that the char field (of the above struct) is not mutable (after the struct is created/initialized).
And that's the case of Go implementation (although the implementation details are different - [the immutable] semantics are the important thing here).
> That's not correct. Conforming implementation must (in the above implementation case example) guarantee that the char field (of the above struct) is not mutable (after the struct is created/initialized).
No, it does not have to, because you won't be able to tell the difference.
> And that's the case of Go implementation (although the implementation details are different - [the immutable] semantics are the important thing here).
Indeed. Semantics of my string implementation are the same. You can't tell the difference.
string /variables/ are mutable; you can freely change their value, as the
term "variable" hints.
string /values/ are immutable; you cannot update the components of
a string value, ie
"hello"[2] = 'p'
x := "chat"; x[3] = 't'
are both illegal, while a subsequent
x = "chad"
is legal.
/Because/ strings are immutable, implementations can share storage
for the component bytes comprising the spelling of the string, rather than
having to make a fresh (possibly large) copy:
y := "some enormous string a long way down to the chemists."
ylette := y[10:20]
The Go implementation has both y and ylette referring to "the same"
array of bytes. Since one cannot update the bytes, updating inside y
can't change the value of ylette.
Hence Dmitry's "you can't tell the difference" [1].
(Of course this means that a one-character slice of a multi-megabyte
string might end up holding onto the whole big thing, depending on
interesting details of the garbage collector.)
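A runnable sketch of the same point. Whether y and ylette really share bytes is an implementation detail that safe code cannot observe; what the code can show is that rebinding y leaves ylette alone. The use of strings.Clone (available since Go 1.18) to get an independent copy is my addition, and the function names are made up:

```go
package main

import (
	"fmt"
	"strings"
)

// sliceThenRebind takes a substring, then rebinds the original variable.
// The substring is unaffected because string values never change.
func sliceThenRebind() string {
	y := "some enormous string a long way down to the chemists."
	ylette := y[10:20] // may share storage with y; no way to tell from here
	y = "chad"         // rebinds the variable, touches no bytes
	_ = y
	return ylette
}

// detach returns a copy of s that does not keep a larger parent string
// reachable (strings.Clone, Go 1.18 or later).
func detach(s string) string {
	return strings.Clone(s)
}

func main() {
	sub := sliceThenRebind()
	fmt.Println(sub)         // ous string
	fmt.Println(detach(sub)) // ous string
}
```

The copy matters only for memory retention; its value is identical to the substring.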
I hope any sloppiness in my wording destructively interferes with
any in the spec.
Chris
[1] From inside safe code and assuming loadsa memory, no, more
than that.
--
Chris "for constructive effect" Dollin
On 3 October 2011 11:35, Dmitry Vyukov <dvy...@google.com> wrote:
> That's a very basic property that holds for basically all types - if you
> have a private object that nobody else have references to and you do not
> mutate it, then its value does not change. It's not immutability.
it doesn't hold for maps or slices, both of which i'd consider "mutable".
it depends whether we're talking about mutability of the *type*
or mutability of the *variable*. all variables in Go are mutable - strings
are no different - but only some types are mutable.
perhaps that distinction is the source of what seems like an argument at cross
purposes here.