I'm not sure I understand your question entirely. E.g., you say that
"knowing that something is immutable means that I can write code which
is parallel and updates to one reference of the object...." But when
something is immutable, it can not be updated.
I would say that in Go, constants are immutable. Constants can be
explicitly declared via const declarations, or they may be constant
literals. Constant literals include integers, floating point numbers,
complex numbers, and strings. This is all the same as in most other
languages (early versions of C permitted modifying string literals, but
current ones do not).
So when you ask which types are immutable, I'm not sure what question
you are asking. I would say that there is no such thing as an immutable
type in Go. In Go, you can declare a variable of any type, and you can
change the value of that variable. What is immutable in Go is
constants, not types.
You ask a specific question about creating a string in a loop, and
whether that creates something that has to be garbage collected. The
answer to that question is, it depends.
for i := 0; i < 10000; i++ {
	fmt.Println("Hi")
}
That loop will not create any extra strings.
for i := 0; i < 10000; i++ {
	a := "a"
	a += s()
	fmt.Println(a)
}
That loop will (probably) create a new string which has to be garbage
collected each time through the loop, even if s happens to be
func s() string {
	return "b"
}
That is, even though the loop will always wind up printing the string
"ab", a new string value will (probably) be created each time through
the loop, and the garbage collector will have to collect all those
values.
Ian
I think you are using a different vocabulary than Philip. For example,
he asked "how can I do mutable update to strings in a loop?". This
most likely means to mutate the content of a string. Since this is
impossible in Go, the code has to use []byte.
In your response, you wrote that "updates to strings can not be
atomic". This suggests you mean a string variable, not the content of
a string.
... or I am missing something.
Immutability doesn't imply thread-safety.
I am not sure that I know what Mutability or Immutability is
anymore :) I don't see a definition of it here:
http://golang.org/doc/go_spec.html
My idea of mutability is that when two threads make updates to the
object, they both see the same thing.
Immutability is when one thread makes an update to the object and the
other thread never sees the update (COW). This gives thread safety.
Const, or constant, is just that: you are prevented from updating
the object.
Is this really true? How do you know this, do you have example code?
Do you get this from the documentation?
What do you mean by "a normal object?"
As I said in my previous reply, in Go, constants are immutable, and
variables are not. So if by "a normal object" you mean a variable, then
it is mutable. If you change a global variable in one goroutine, the
change will be seen by another goroutine.
The idea seems very simple to me, so I think there must be some
communication difficulty which is making it seem complex to you.
> I find the terms immutable and mutable difficult to understand for
> this go-lang, my idea of mutable is what you have in C, you have a
> pointer to some data ie, int *ref; if you want to update the ref you
> can do (*ref) = 5; (if my memory serves me). That's a mutable pointer,
> which I consider the same as a mutable reference.
As I said in my previous reply, Go is basically the same as C and most
other languages in this regard.
You say you find the terms difficult to understand for Go. The Go spec
uses the words "mutable" and "immutable" in exactly one place:
A _string type_ represents the set of string values. Strings behave
like arrays of bytes but are immutable: once created, it is
impossible to change the contents of a string.
This means that once you create a string value, you can not change that
value. You can of course change the value of a string _variable_.
However, once a string _value_ exists, it will never change. The same
is true of an integer value, of course. Once you create the integer
value 5, it will always be 5. You can of course change the value of an
integer variable, but you can't change a value.
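To make the distinction between a string variable and a string value concrete, here is a minimal sketch. The helper upperFirst is a hypothetical name used only for illustration; it assumes ASCII input for the case change.

```go
package main

import "fmt"

// upperFirst (hypothetical helper) returns a copy of s with its first byte
// upper-cased, for ASCII letters only. The original string value is never
// modified: converting to []byte copies the bytes.
func upperFirst(s string) string {
	if s == "" {
		return s
	}
	b := []byte(s) // copy of the bytes of s
	if b[0] >= 'a' && b[0] <= 'z' {
		b[0] -= 'a' - 'A'
	}
	return string(b) // a new string value
}

func main() {
	s := "hello"
	t := s      // t now holds the same string value as s
	s = "world" // changes the variable s; the value "hello" is untouched
	fmt.Println(s, t) // world hello

	// s[0] = 'W' would not compile: the contents of a string cannot be assigned to.
	u := upperFirst(t)
	fmt.Println(t, u) // hello Hello
}
```

Assigning to s only changes which value the variable holds; t still sees "hello", and the only way to get "Hello" is to build a new value.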
This notion of an immutable string value is something which does not
exist in C. In C a string literal is immutable, but C does not have a
string type and therefore does not have string values. In C a string
literal does not have string type, it has the type "const char [N]" for
some N.
> Then secondly, how can I do mutable update to strings in a loop? It
> seems in the cases you have given, its going to produce a lot of
> rubbish to be garbage collected. It was a killer for a project in a
> company I was in (I didn't work on it). They had many loops which
> updated Java Strings without using StringBuffer, so the consequence
> was too much rubbish for the garbage collector, this was early 2000's.
> That was a serious problem. You might say that now, the garbage
> collectors are so good it doesn't matter, but it does matter for a
> computer game if your doing lots of data processing from frame to
> frame. For a business app maybe not.
Yes, pretty much the same problem can occur in Go. Programs normally
avoid this by using []byte rather than string.
Ian
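As a rough sketch of the []byte approach mentioned above, the earlier loop can be rewritten to reuse one buffer instead of creating a new string on each iteration. The function name line and the buffer capacity of 16 are assumptions chosen for illustration.

```go
package main

import "os"

// s stands in for any function that returns a string, as in the loop above.
func s() string { return "b" }

// line (hypothetical helper) rebuilds the output in buf, reusing its
// backing array when the capacity is sufficient.
func line(buf []byte) []byte {
	buf = append(buf[:0], 'a') // reset the length, keep the capacity
	buf = append(buf, s()...)  // append the bytes of a string
	return append(buf, '\n')
}

func main() {
	buf := make([]byte, 0, 16) // allocated once, outside the loop
	for i := 0; i < 3; i++ {
		buf = line(buf)
		os.Stdout.Write(buf) // writes "ab\n" without building a string
	}
}
```

Whether this is worth doing depends on how hot the loop is; for most code the string version is clearer.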
+1
Moreover, even if string buffers are generally immutable in an
implementation, there is no reason not to mutate one when the compiler
is sure there is only one reference to the buffer.
I thought that for the change to be /guaranteed/ to be seen there had
to be some sort of synchronisation between the goroutines?
Chris
--
Chris "allusive" Dollin
To be clear, the spec does already say that indexing expressions in a
string are not addressable, or, rather, it lists all addressable
expressions, and a string index is not one of them.
I agree that the compiler can optimize string manipulations in some
cases. This is another area where escape analysis can help.
Ian
> On 14 November 2011 06:28, Ian Lance Taylor <ia...@google.com> wrote:
>> If you change a global variable in one goroutine, the
>> change will be seen by another goroutine.
>
> I thought that for the change to be /guaranteed/ to be seen there had
> to be some sort of synchronisation between the goroutines?
Yes. I was speaking loosely. I should probably have said "may" rather
than "will."
Ian
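A minimal sketch of the synchronisation point raised above: a write to a global variable in one goroutine is only guaranteed to be seen by another goroutine if something orders the two. Here a channel close is used; the names msg and setAndSignal are hypothetical.

```go
package main

import "fmt"

var msg string // a global variable shared between goroutines

// setAndSignal (hypothetical helper) writes msg in a new goroutine and
// waits on a channel before reading it back.
func setAndSignal() string {
	done := make(chan struct{})
	go func() {
		msg = "hello" // write in one goroutine
		close(done)   // the close is ordered before the receive below completes
	}()
	<-done     // synchronises with the goroutine above
	return msg // guaranteed to observe "hello"
}

func main() {
	fmt.Println(setAndSignal()) // hello
}
```

Without the channel (or a mutex, or similar), the reading goroutine may or may not observe the new value.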
"String immutability" seems to be redundant in the spec and cause of
a lot of confusion. Moreover, it does not affect observable behavior in
any way, so... I am not even sure what it means in the context of a
language specification. I think that an implementation that mutates
string buffers every now and then is still conforming... Moreover, a
guarantee of string buffer/value immutability makes little sense for a
user without a guarantee of string value/buffer sharing, which is not
provided. It seems that the aspect is better covered by Effective Go.
Every detail of this is wrong.
-rob
I think the answer is that there are no immutable types in Go and there
is no copy on write (COW). Whereas Scala's types are usually immutable,
preferring immutability as a solution to multi-threading (also through
actors), and do COW when written to, Google Go prefers a different
concurrency paradigm. Google Go likes mutexes and communication through
channels, as described under concurrency:
http://golang.org/doc/effective_go.html#concurrency
Regarding Google Go's String class, the string implementation prevents
updates to its internal string representation, and this is a property of
the go-lang String's encapsulation and implementation.
From a string user's point of view this appears to be a normal string
which is being updated.
Perhaps the Go-lang string isn't such a good idea under some
circumstances, such as looping to create many strings, as this might
create a lot of strings for the garbage collector.
It could be useful to have a mutable string class for Go, like Java has
a StringBuffer for its String class.
So, Google Go types and any struct/class you create are all mutable by
default, but Go encourages concurrency patterns which are not based on
COW, so mutability and immutability are not really a concern.
I come from a C, then C++, then Java, then Scala background, so my
ideas of mutability and immutability come from there. In particular
more recently Scala.
Please correct me if I am wrong.
> Hi Rob,
>
> Well, I'll just quote at other webpages and you can work out if its
> right or wrong.
>
> From http://en.wikipedia.org/wiki/UTF-8
> "For every UTF-8 byte sequence corresponding to a single Unicode
> character, the first byte unambiguously indicates the length of the
> sequence in bytes"
>
> "UTF-8 is a variable-width encoding, with each character represented
> by one to four bytes."
>
> C style strings are null terminated
> http://en.wikipedia.org/wiki/C_string_handling
>
> Best Regards, Philip
>
> On Nov 15, 12:05 am, "Rob 'Commander' Pike" <r...@golang.org> wrote:
>> On Mon, Nov 14, 2011 at 3:03 AM, philip <philip14...@gmail.com> wrote:
>>> I believe its not a good idea to use a byte slice or bytes.Buffer for
>>> the string because of encoding issues,
It's a fine idea, the best idea.
>>> in particular the UTF-8 or
>>> UTF-16 encoding is different from a "C" style string which is just a
>>> string of bytes.
A Go string is also just a string of bytes. A Go string *constant* is encoded as UTF-8 (provided there are no \x escapes) but there is no requirement that a Go string have any particular encoding or represent any particular characters. It's just bytes, and in this way (although not some others, such as NUL termination) is equivalent to a C string.
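A small sketch of the "just bytes" point, using the standard unicode/utf8 package. The helper describe is a hypothetical name for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe (hypothetical helper) reports the byte length of s, the number of
// runes utf8 counts in it, and whether s is valid UTF-8.
func describe(s string) (nbytes, nrunes int, valid bool) {
	return len(s), utf8.RuneCountInString(s), utf8.ValidString(s)
}

func main() {
	raw := "\xff\xfe"    // two arbitrary bytes, not valid UTF-8
	text := "h\u00e9llo" // U+00E9 is encoded as two bytes

	fmt.Println(describe(raw))  // 2 2 false
	fmt.Println(describe(text)) // 6 5 true
}
```

Both values are perfectly good Go strings; len always counts bytes, and nothing in the string type checks the encoding.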
>>> UTF-8 keeps the length of the string at the beginning
>>> of the string.
It does no such thing. I believe you misunderstand this sentence:
> "For every UTF-8 byte sequence corresponding to a single Unicode
> character, the first byte unambiguously indicates the length of the
> sequence in bytes"
That does not say the length of the string is at the beginning of the string. What it says is that for each encoded character, the length of the encoding can be determined by studying the first byte of that character's encoding.
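To illustrate what that sentence does say, here is a sketch that derives the length of one encoded character from its first byte alone and compares it with what unicode/utf8 reports. The function seqLen is a hypothetical helper, not part of any library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// seqLen (hypothetical helper) reports how many bytes a UTF-8 sequence
// occupies, judging only by the bit pattern of its first byte. It returns 0
// for a continuation byte (10xxxxxx) and for bytes 0xF8 and above. It does
// not reject the few lead bytes that can never start a valid sequence.
func seqLen(first byte) int {
	switch {
	case first&0x80 == 0x00: // 0xxxxxxx
		return 1
	case first&0xE0 == 0xC0: // 110xxxxx
		return 2
	case first&0xF0 == 0xE0: // 1110xxxx
		return 3
	case first&0xF8 == 0xF0: // 11110xxx
		return 4
	}
	return 0
}

func main() {
	for _, s := range []string{"a", "\u00e9", "\u4e2d", "\U0001F600"} {
		_, size := utf8.DecodeRuneInString(s)
		fmt.Println(seqLen(s[0]), size) // the two numbers agree: 1, 2, 3, 4
	}
}
```

Note that this is a per-character length; nothing here records the length of the whole string.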
>>
>> Every detail of this is wrong.
I stand by this statement.
-rob
> Utf-8 won't encode into a byte buffer as well as you suggest because
> the single character in utf-8 is typically 2 bytes, it can be between
> "UTF-8 is a variable-width encoding, with each character represented
> by one to four bytes.", (wikipedia). If I am using Chinese characters
> in Strings, then a single byte is not a good way to go. For the other
> issues I will read through and come back with another email.
UTF-8 encodes into a byte buffer just fine. In fact, it does not
readily encode into anything else. Yes, a Unicode character turns into
1 to 4 bytes in UTF-8. That just means that there is no one-to-one
correspondence between bytes and characters. It does not mean that you
can't use a byte buffer to represent UTF-8.
As evidence of this, many functions in http://golang.org/pkg/bytes deal
with UTF-8 in a byte buffer.
Ian
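As a hedged illustration of UTF-8 living in an ordinary byte slice, the sketch below uses a few functions from the bytes and unicode/utf8 packages on Chinese text. The helper firstRune is a hypothetical name.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// firstRune (hypothetical helper) decodes the first character stored in b
// and reports how many bytes it occupied.
func firstRune(b []byte) (rune, int) {
	return utf8.DecodeRune(b)
}

func main() {
	b := []byte("你好, Go") // UTF-8 bytes in a plain []byte

	fmt.Println(len(b))                  // 10 bytes
	fmt.Println(utf8.RuneCount(b))       // 8 characters
	fmt.Println(bytes.IndexRune(b, '好')) // 3: byte offset of the second character
	r, n := firstRune(b)
	fmt.Println(string(r), n) // 你 3
}
```

The slice elements are single bytes throughout; the grouping into characters is done by the decoding functions, not by the container.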
> I am totally worried about "Go it is []byte and/or bytes.Buffer." in
> regards to foreign languages, such as Chinese. If we are talking about
> a byte representing a character and a byte meaning 8 bits, then that's
> not a good idea for Chinese and many other languages. (I live in Hong
> Kong). If you do this (use 8 bit byte slices, is it slices here,
> arrays? ), then you decide to make your application multi-lingual,
> then your in trouble.
Just to be completely crystal clear, nobody is talking about "a byte
representing a character."
Go even recently introduced a name for the type which represents a
Unicode character: rune, which is a 32-bit signed integer. And Go does
in fact support using []rune to represent a string of Unicode
characters, if you feel uncomfortable using UTF-8 and []byte.
Ian
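A minimal sketch of the []rune alternative described above. The helper replaceRune is a hypothetical name; it panics if i is out of range, which a real program would want to check.

```go
package main

import "fmt"

// replaceRune (hypothetical helper) returns a copy of s with the i-th
// character (not byte) replaced by r. It panics if i is out of range.
func replaceRune(s string, i int, r rune) string {
	rs := []rune(s) // decodes UTF-8 into one 32-bit value per character
	rs[i] = r
	return string(rs) // re-encodes as UTF-8
}

func main() {
	s := "你好"
	fmt.Println(len(s), len([]rune(s)))  // 6 2
	fmt.Println(replaceRune(s, 1, '们')) // 你们
}
```

The trade-off is memory: []rune uses four bytes per character, while the UTF-8 form uses one to four.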
We're not.
Can you show us a specific problem you've actually got? I'm feeling
rather ungrounded.
(You have read the Go spec, yes?)
Hi there,
There is a lot of terminology mixed up here. I used the word class;
you don't have classes in go-lang, OK, that's fine, but you know what I
mean. I said that go-lang types are mutable; you said that types don't
mutate. Well, I know that, but you know what I mean without me
having to be very specifically correct in the terminology. Instances
(variable values) of types can be mutable and I wanted to know
originally which types allow instances of those to be mutable. In
particular the string type doesn't allow instances of the string to be
mutable.
A string value is immutable specifically because it is of the type
string and how it is defined in the spec and how it is implemented in
the go language, so my question was back at stackoverflow "Which types
are mutable and immutable in the Google Go Language?". Which of
course, is a wrong question because types are not mutable, but
everyone understands that. If I said string class, you know I am
referring to string type. This is really just terminology, since you
don't have a class - you have types - then you already know what I
mean.
So, you are correct - about the terminology, but I think a lot of what
you said is correcting my terminology within the go-language context
and that we are talking about the same concepts.
that's not a problem. each time you append a character,
you can append more than one byte.
for example, here's some code that takes an array
of strings and concatenates them, separated by smiling faces:
import "bytes"
func smileys(a []string) string {
	var b bytes.Buffer
	for _, s := range a {
		b.WriteString(s)
		b.WriteRune('☺')
	}
	return b.String()
}
here's some code that does the same thing but using
a byte slice (requires weekly Go version for byte-slice string appending):
func smileys(a []string) string {
	var b []byte
	sep := []byte("☺")
	for _, s := range a {
		b = append(b, s...)
		b = append(b, sep...)
	}
	return string(b)
}
> A lot of people said use byte buffer. I assume a byte means 8 bits.
>
> Just go back through the thread - I havn't listed them all.
>
> 1.Instead of using a string builder, you can either use (1) a byte
> slice and
> append, or (2) a bytes.Buffer. T
>
> 2. Use a bytes.Buffer to avoid that problem if you are going to
> generate large strings.
>
> 3. Like tux21b wrote: you can either use a byte slice and append, or a
> bytes.Buffer.
>
> 4. think you are using a different vocabulary than Philip. For
> example,
> he asked "how can I do mutable update to strings in a loop?". This
> most likely means to mutate the content of a string. Since this is
> impossible in Go, the code has to use []byte.
1. Please consider that we know what we are talking about.
2. Please reread earlier messages until they make sense.
3. If you still have questions, please ask specific questions, ideally
with code examples, rather than general questions using unclear
vocabulary.
Thanks.
Ian
>
> The problem is:
>
> 1. In the go spec it says the size of a byte is 8 bits, at the bottom
> of the spec.
> http://golang.org/doc/go_spec.html
>
> type size in bytes
> byte, uint8, int8 1
>
> 2. People in this thread said to use a byte buffer when doing string
> appending (in response to my complaint about potential garbage
> collection problems in a loop).
>
> 3. UTF-8 string chars can be between 2 to 4 bytes in length.
Yes. And these 2 to 4 8-bit bytes will be stored as 2 to 4 8-bit bytes in the bytes.Buffer, the same way they're stored as 2 to 4 8-bit bytes in a string, so UTF-8 doesn't care. It's not like there are variable sized elements in the string that you're trying to fit into fixed sized 8-bit containers. Both types have fixed size 8-bit elements and leave it up to the encoding to group them into runes.
PS - Rob Pike wrote the first implementation of UTF-8 along with the encoding's inventor, Ken Thompson ;)
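The point about 8-bit elements can be checked with a short sketch: each character written to a bytes.Buffer simply occupies as many bytes as its UTF-8 encoding needs. The helper encodedLens is a hypothetical name, and the sketch assumes its input is valid UTF-8.

```go
package main

import (
	"bytes"
	"fmt"
)

// encodedLens (hypothetical helper) writes each character of s to a
// bytes.Buffer and records how many bytes each one took. It returns those
// counts together with the buffer contents.
func encodedLens(s string) ([]int, string) {
	var buf bytes.Buffer
	var lens []int
	for _, r := range s {
		n, _ := buf.WriteRune(r) // n is the number of bytes written
		lens = append(lens, n)
	}
	return lens, buf.String()
}

func main() {
	lens, out := encodedLens("Go 中文 ☺")
	fmt.Println(lens, out) // [1 1 1 3 3 1 3] Go 中文 ☺
}
```

The buffer never needs to know about characters; the round trip back to a string returns the same text.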
> 1. In the go spec it says the size of a byte is 8 bits, at the bottom
> of the spec.
> http://golang.org/doc/go_spec.html
>
> type size in bytes
> byte, uint8, int8 1
>
> 2. People in this thread said to use a byte buffer when doing string
> appending (in response to my complaint about potential garbage
> collection problems in a loop).
>
> 3. UTF-8 string chars can be between 2 to 4 bytes in length.
the notion that characters might be two bytes could be very detrimental to their Go understanding.