Which types are immutable?


philip

Nov 12, 2011, 12:50:52 AM
to golang-nuts
Hi,

I asked this question on StackOverflow but the answers don't seem to
answer my question yet.
http://stackoverflow.com/questions/8018081/which-types-are-mutable-and-immutable-in-the-google-go-language

Are all my concerns valid or not valid?

Regards, Philip

Ian Lance Taylor

Nov 12, 2011, 1:57:38 PM
to philip, golang-nuts
philip <phili...@gmail.com> writes:

I'm not sure I understand your question entirely. E.g., you say that
"knowing that something is immutable means that I can write code which
is parallel and updates to one reference of the object...." But when
something is immutable, it can not be updated.

I would say that in Go, constants are immutable. Constants can be
explicitly declared via const declarations, or they may be constant
literals. Constant literals include integers, floating point numbers,
complex numbers, and strings. This is all the same as in most other
languages (early versions of C permitted modifying string literals, but
current ones do not).

So when you ask which types are immutable, I'm not sure what question
you are asking. I would say that there is no such thing as an immutable
type in Go. In Go, you can declare a variable of any type, and you can
change the value of that variable. What is immutable in Go is
constants, not types.
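
A minimal sketch of that distinction, added for illustration; the names `greeting` and `describe` are made up and do not come from the thread. It shows a variable being reassigned while the constant it was initialised from stays fixed.

```go
package main

import "fmt"

// greeting is a constant: it can never be assigned to.
const greeting = "Hi"

// describe returns the value of a variable before and after reassignment.
func describe() (before, after string) {
	s := greeting // s is a variable initialised from the constant
	before = s
	s = "Bye" // allowed: a variable of any type can be given a new value
	after = s
	// greeting = "Bye" // would not compile: cannot assign to a constant
	return before, after
}

func main() {
	before, after := describe()
	fmt.Println(before, after) // prints "Hi Bye"
}
```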

You ask a specific question about creating a string in a loop, and
whether that creates something that has to be garbage collected. The
answer to that question is, it depends.

for i := 0; i < 10000; i++ {
    fmt.Println("Hi")
}

That loop will not create any extra strings.

for i := 0; i < 10000; i++ {
    a := "a"
    a += s()
    fmt.Println(a)
}

That loop will (probably) create a new string which has to be garbage
collected each time through the loop, even if s happens to be

func s() string {
    return "b"
}

That is, even though the loop will always wind up printing the string
"ab", a new string constant will (probably) be created each time through
the loop, and the garbage collector will have to collect all those
constants.

Ian

tux21b

Nov 12, 2011, 3:31:19 PM
to golan...@googlegroups.com
Instead of using a string builder, you can either use (1) a byte slice and append, or (2) a bytes.Buffer. That's much better than adding parts one by one. Go will behave similarly to Java in this respect.

Another thing which bothers me is the word "parallel" in your question on Stack Overflow. If you are referring to "immutable data structures" instead of constants, then Go's string type will probably not meet your requirements, since updating a string is not atomic (you have to change both the address of the data array and the length of the string). If you need such data structures, you will need to implement them on your own (e.g. by storing a pointer to an "immutable string" instead).

-christoph

philip

Nov 13, 2011, 8:01:19 AM
to golang-nuts
Hi Ian,

Regarding the question "I'm not sure I understand your question entirely.
E.g., you say that
"knowing that something is immutable means that I can wr..."

I mean that if I am thinking of two threads (though you use goroutines
in Go), and one thread has a reference to an object and another thread
has a reference to the same object, then in the immutable case, if
thread A updates the object, thread A will see its own update but
thread B will see the original object. I consider this case 1.
In the mutable case, or how I see mutability, which I consider case 2,
if they both have a reference to the object and thread A updates it,
then thread B will see the update some time after A updated it, and
this may not be thread safe. However, maybe this is something I want to
be able to do. If you are thinking from a C language point of view, if
they both have a pointer to some memory and one updates it, then they
both see the update at some point.

This is the consequence of immutability which I am concerned about. So
my question is basically: are we in case 1 or case 2 for normal
objects? This really does matter for how you treat them IF they are
shared object references across threads. You're probably going to say
that it's a bad idea to share references across threads and that the
resources should be encapsulated by message-passing actors. Well, you
don't have actors, but you have something similar in Go. At least in
Scala that's how they would answer: they would say you need to use
actors if you want mutable state.

I find the terms immutable and mutable difficult to understand for
Go. My idea of mutable is what you have in C: you have a pointer to
some data, i.e. int *ref; if you want to update the ref you can do
(*ref) = 5; (if my memory serves me). That's a mutable pointer,
which I consider the same as a mutable reference.

Then secondly, how can I do mutable update to strings in a loop? It
seems in the cases you have given, it's going to produce a lot of
rubbish to be garbage collected. It was a killer for a project in a
company I was in (I didn't work on it). They had many loops which
updated Java Strings without using StringBuffer, so the consequence
was too much rubbish for the garbage collector. This was in the early
2000s. That was a serious problem. You might say that now the garbage
collectors are so good it doesn't matter, but it does matter for a
computer game if you're doing lots of data processing from frame to
frame. For a business app maybe not.

Regards, Philip





tux21b

Nov 13, 2011, 12:23:55 PM
to golan...@googlegroups.com
Hi philip,

I think I've already answered your questions, but maybe you didn't get it. So I will
try a 2nd time. With an example this time: http://play.golang.org/p/C4_Rpatzrj

Go has similarities with Java here, so you might encounter the same problem
you have described with Go too. Use a bytes.Buffer to avoid that problem if you
are going to generate large strings.
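
A short sketch of the two alternatives mentioned in this thread, a byte slice with append and a bytes.Buffer. The helper names `repeatAppend` and `repeatBuffer` are hypothetical; this is an illustration, not the code behind the playground link above.

```go
package main

import (
	"bytes"
	"fmt"
)

// repeatAppend builds the result in a byte slice using append.
func repeatAppend(part string, n int) string {
	buf := make([]byte, 0, len(part)*n) // reserve the capacity up front
	for i := 0; i < n; i++ {
		buf = append(buf, part...)
	}
	return string(buf) // one final conversion to a string
}

// repeatBuffer does the same with a bytes.Buffer, which grows as needed.
func repeatBuffer(part string, n int) string {
	var b bytes.Buffer
	for i := 0; i < n; i++ {
		b.WriteString(part)
	}
	return b.String()
}

func main() {
	fmt.Println(repeatAppend("foo", 3)) // prints "foofoofoo"
	fmt.Println(repeatBuffer("foo", 3)) // prints "foofoofoo"
}
```

Both helpers accumulate bytes in one growing buffer and only produce a string at the end, instead of creating a new string on every iteration.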


As for your second question, about updating strings concurrently: don't do that in
Go!

You cannot compare Go to Scala here. Immutable data structures (which are
quite popular in Scala) use a trick. They just store a (mutable) pointer
to another object (e.g. a big list, a tree or a string) which they consider immutable.
On updates, they follow the COW (copy-on-write) idiom, which means that
they do not change the data directly. They first make a copy of the data (in whole
or in part, depending on the data structure) and then flip the pointer
atomically. Threads might still use an older version of the object for reading. (After
some time, the GC takes care of it.)

You can implement such things in Go too. Just store pointers and use the sync/atomic
package to change them atomically (the functions in this package also ensure
visibility and memory ordering). However, unlike Scala, you have to implement that
on your own.

Go's string type is not suited for COW. Strings are defined by a pointer to an underlying
array AND an integer which stores the length of the string. So, updates to strings
cannot be atomic. If you update a string concurrently from different threads / goroutines,
your program might behave unexpectedly or crash.

If you really need to do such a thing, either use (a) a single goroutine for that + message
passing OR (b) a sync.Mutex to lock the string before writing OR (c) implement your own
COW data structure (similar to Scala) on top of string. It's quite easy too (just use a pointer
+ atomic.LoadUintptr() / atomic.StoreUintptr()).
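
A rough sketch of option (c), for illustration only. It swaps in atomic.Value (available in the sync/atomic package of later Go releases) in place of the LoadUintptr/StoreUintptr calls named above, and the type `SharedString` with its methods is a hypothetical name, not an existing API. Readers always get a complete snapshot; writers build a new string and publish it atomically.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// SharedString is a hypothetical copy-on-write holder for a string.
type SharedString struct {
	mu sync.Mutex   // serialises writers only; readers never take it
	v  atomic.Value // always holds a value of type string
}

// NewSharedString creates a holder with an initial snapshot.
func NewSharedString(s string) *SharedString {
	h := &SharedString{}
	h.v.Store(s)
	return h
}

// Load returns the current snapshot without blocking.
func (h *SharedString) Load() string {
	return h.v.Load().(string)
}

// Append builds a new string and atomically publishes it.
// Snapshots handed out earlier are left untouched.
func (h *SharedString) Append(suffix string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.v.Store(h.Load() + suffix)
}

func main() {
	h := NewSharedString("a")
	old := h.Load()

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			h.Append("b")
		}()
	}
	wg.Wait()

	fmt.Println(old, h.Load()) // prints "a abbbb"
}
```

The mutex is only there so that two concurrent Append calls do not lose each other's update; a reader holding an old snapshot simply keeps it until the garbage collector reclaims it.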

Regards,
Christoph

Nov 13, 2011, 1:00:03 PM
to golang-nuts
On Nov 13, 2:01 pm, philip <philip14...@gmail.com> wrote:
> [cut]
> Your probability going to say
> that's a bad idea to share references across threads and that the

In my opinion, sharing references across threads is a useful idea. But
it has to be done in a controlled way. If using clear and simple rules
for controlling concurrent access to shared data, and the structure of
the shared data is simple, and there is a clear trade-off (for
example: program performance, code clarity, etc), shared data is a
good idea.

> resources should be encapsulated by message passing actors, well you
> don't have actors, you have something similar in go-lang, at least in
> Scala thats how they would answer and say you need to use Actors if
> you want mutable state.
>
> I find the terms immutable and mutable difficult to understand for
> this go-lang, my idea of mutable is what you have in C, you have a
> pointer to some data ie, int *ref; if you want to update the ref you
> can do (*ref) = 5; (if my memory serves me). That's a mutable pointer,
> which I consider the same as a mutable reference.

*ref=5 isn't mutating a pointer, it is mutating an integer.

To mutate the pointer variable in Go, write: ref=new(int)
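
A small sketch contrasting the two operations; `pointerDemo` and the variable names are made up for illustration. Writing through the pointer changes the int that every alias sees, while assigning to the pointer variable only changes where that one variable points.

```go
package main

import "fmt"

// pointerDemo returns what an alias observes after each kind of mutation.
func pointerDemo() (viaAlias, afterRepoint int) {
	x := 1
	ref := &x
	alias := ref // a second pointer to the same int

	*ref = 5 // mutates the integer; alias sees the change
	viaAlias = *alias

	ref = new(int) // mutates the pointer variable; it now points to a fresh int
	*ref = 7       // changes the fresh int only
	afterRepoint = *alias

	return viaAlias, afterRepoint
}

func main() {
	fmt.Println(pointerDemo()) // prints "5 5"
}
```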

> Then secondly, how can I do mutable update to strings in a loop?

You cannot. Go strings are immutable.

Like tux21b wrote: you can either use a byte slice and
append, or a bytes.Buffer.

See also the http://golang.org/pkg/bytes/ package

If you need to know the internal representation of Go values, you can
examine the C source code at http://golang.org/src/pkg/runtime/runtime.h

Nov 13, 2011, 1:06:52 PM
to golang-nuts
On Nov 13, 6:23 pm, tux21b <tux...@gmail.com> wrote:
> I think I've already answered your questions, but maybe you didn't get it.

The above sentence seems a bit offensive to me.

tux21b

Nov 13, 2011, 1:12:56 PM
to golan...@googlegroups.com
Oh, sorry for that. I just meant that this post is a more detailed answer, but
basically the same. Also note that English is not my primary language, so it might
be entirely my fault.

-christoph

Nov 13, 2011, 2:03:33 PM
to golang-nuts
On Nov 13, 7:12 pm, tux21b <tux...@gmail.com> wrote:
> Oh sorry for that. I just meant that this post is just are more detailed
> answer but

(No harm done)

> basically the same. Also note that English is not my primary language, so
> it might
> be entirely my fault.

About the string updates:

I think you are using a different vocabulary than Philip. For example,
he asked "how can I do mutable update to strings in a loop?". This
most likely means to mutate the content of a string. Since this is
impossible in Go, the code has to use []byte.
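
A sketch of what "use []byte" can look like in practice; the helper `upperFirst` is hypothetical and handles ASCII only. The conversion to []byte copies the bytes, so the original string value is left unchanged.

```go
package main

import "fmt"

// upperFirst returns a copy of s with its first byte upper-cased (ASCII only).
func upperFirst(s string) string {
	if s == "" {
		return s
	}
	b := []byte(s) // the conversion copies the bytes of s
	if b[0] >= 'a' && b[0] <= 'z' {
		b[0] -= 'a' - 'A' // mutating the slice is allowed
	}
	return string(b)
}

func main() {
	orig := "hello"
	fmt.Println(upperFirst(orig), orig) // prints "Hello hello"
}
```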

In your response, you wrote that "updates to strings can not be
atomic". This suggests you mean a string variable, not the content of
a string.

... or I am missing something.

tux21b

Nov 13, 2011, 3:02:05 PM
to golan...@googlegroups.com
On Sunday, November 13, 2011 8:03:33 PM UTC+1, ⚛ wrote:
> I think you are using a different vocabulary than Philip. For example,
> he asked "how can I do mutable update to strings in a loop?". This
> most likely means to mutate the content of a string. Since this is
> impossible in Go, the code has to use []byte.
 
Yes, I agree there, and I have suggested the same (a bytes.Buffer is
basically the same). []byte would work too.
 
> In your response, you wrote that "updates to strings can not be
> atomic". This suggests you mean a string variable, not the content of
> a string.

Yes, that's about the string variable if you want to call it so.

But I think here is also the root of the misunderstanding.

1) String values are immutable in Go, so adding something to a string will
create a new one.

2) Scala also offers immutable data structures (in addition to string, they
also offer sets, trees and several other ones). Their "values" are immutable
too, and the advantage of their immutable implementation is basically that
you can use COW (copy-on-write) instead of expensive locking / unlocking
mechanisms for concurrent access.

In my opinion, philip has also asked for that (he talked about parallelism,
Scala, different threads and concurrent updates).

That's however a completely different question and the word  "immutable"
means different things in Scala and Go (and, as Ian said, it might also mean
constants in Go. *g*).

Immutability doesn't imply thread-safety.

In my post above, I have tried to summarize the differences between
Scala's immutable data-structures and Go's immutable string.

> ... or I am missing something.

I am not sure. The thread (and also the stackoverflow questions) seems to
be full of misconceptions :D

-christoph

Jan Mercl

Nov 13, 2011, 3:38:03 PM
to golan...@googlegroups.com
On Sunday, November 13, 2011 9:02:05 PM UTC+1, tux21b wrote:
> Immutability doesn't imply thread-safety.

Immutability of some value (semantically equivalent to guaranteed read only access *after* either static or dynamic instantiation) does imply thread safety. As you have correctly said, string values are immutable and those values are thus thread safe. String variables (internally referring to string values), on the other side also correctly stated by you, are mutable - as all variables are. But any variable which a CPU doesn't mutate atomically (often the case for non machine word sized variables or non bus/memory/caches locked access generally) is not thread safe in any shared memory model language, which Go happens to be.
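
A sketch of both halves of that statement, with hypothetical names (`lengths`, `Guarded`). Many goroutines read one shared string value without any locking, while a string variable that is written concurrently is protected by a sync.Mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// lengths lets n goroutines read the same string value concurrently.
// No locking is needed because the value is never modified.
func lengths(value string, n int) []int {
	out := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			out[i] = len(value) // read-only access; each goroutine writes its own slot
		}(i)
	}
	wg.Wait()
	return out
}

// Guarded protects a string variable that several goroutines may assign to.
type Guarded struct {
	mu sync.Mutex
	s  string
}

// Set replaces the stored string under the lock.
func (g *Guarded) Set(s string) {
	g.mu.Lock()
	g.s = s
	g.mu.Unlock()
}

// Get returns the stored string under the lock.
func (g *Guarded) Get() string {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.s
}

func main() {
	fmt.Println(lengths("hello", 3)) // prints "[5 5 5]"

	var g Guarded
	g.Set("ab")
	fmt.Println(g.Get()) // prints "ab"
}
```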

philip

Nov 13, 2011, 8:27:56 PM
to golang-nuts
I am not sure that I know what Mutability or Immutability is
anymore :) I don't see a definition of it here:
http://golang.org/doc/go_spec.html

My idea of mutability is that when two threads make updates to the
object, they both see the same thing.
Immutability is when one thread makes an update to the object and the
other thread never sees the update (COW). This gives thread safety.
Const, or constant, is just that: you are prevented from updating
the object.

Without a clear definition this is difficult, can someone give a clear
definition of these terms in Go language?

philip

Nov 13, 2011, 9:13:01 PM
to golang-nuts
Is this really true? How do you know this, do you have example code?
do you get this from the documentation?

"Go's string type is not suited for COW. Strings are defined by a
pointer to an underlying array AND an integer which stores the length
of the string. So, updates to strings can not be atomic. If you update
a string concurrently from different threads / goroutines your program
might behave unexpected or crash."



philip

Nov 13, 2011, 8:47:02 PM
to golang-nuts
Your:

// not a good idea
s := ""
for i := 0; i <= 20; i++ {
    s += "foo"
}

Is exactly what most people are going to do.

On Nov 14, 1:23 am, tux21b <tux...@gmail.com> wrote:

philip

Nov 13, 2011, 8:29:25 PM
to golang-nuts
Yes, correct. I should not have said "That's a mutable pointer". I
meant "That's a mutable int, referenced by a pointer".

philip

Nov 13, 2011, 8:56:31 PM
to golang-nuts
Hi Christoph,

"That's however a completely different question and the word
"immutable"
means different things in Scala and Go" - can you define what mutable/
immutable means in Go?

Thanks, Philip


On Nov 14, 4:02 am, tux21b <tux...@gmail.com> wrote:

Steven Blenkinsop

Nov 14, 2011, 12:38:21 AM
to philip, golang-nuts
On Sun, Nov 13, 2011 at 8:27 PM, philip <phili...@gmail.com> wrote:
> I am not sure that I know what Mutability or Immutability is
> anymore :) I don't see a definition of it here:
> http://golang.org/doc/go_spec.html
 
It might be clearer if the spec just said that indexing expressions on a string aren't addressable. The implications of this are fully defined in the spec. This is what it means when it says strings are immutable.
 
> My idea of mutability is that when two threads make updates to the
> object, they both see the same thing.
> Immutability is when one thread makes update to the object, the other
> thread never sees the update. (COW). This gives thread safety.
> Const, or constant is just that, you are prevented from doing update
> to the object.

This is a potential consequence of (im)mutability but not the meaning. Arrays are mutable. If I pass an array to another goroutine, though, I can't see any changes made to it because the other goroutine is operating on a copy of it. If I pass a reference to the array (via a slice or pointer), then the goroutine can alter my copy of the array, and I'll observe the result. On the other hand, if the array were immutable, then the other goroutine can't alter my copy via the reference (and neither can I). If it wants a changed version, it has to make a local copy. But, while it can't alter my copy, if it has access to my [mutable] reference to my copy, it can replace it with a reference to its copy, and I'll observe the change anyways.
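
The array cases described above can be sketched as follows; `arrayDemo` is a made-up name for illustration. A goroutine that receives the array by value works on a copy, while one that receives a pointer or a slice alters the caller's array.

```go
package main

import (
	"fmt"
	"sync"
)

// arrayDemo returns the caller's array as seen after each goroutine has run.
func arrayDemo() (afterValue, afterPointer, afterSlice [3]int) {
	arr := [3]int{1, 2, 3}
	var wg sync.WaitGroup

	// Passed by value: the goroutine changes its own copy only.
	wg.Add(1)
	go func(a [3]int) { defer wg.Done(); a[0] = 99 }(arr)
	wg.Wait()
	afterValue = arr

	// Passed by pointer: the goroutine changes the caller's array.
	wg.Add(1)
	go func(p *[3]int) { defer wg.Done(); p[0] = 99 }(&arr)
	wg.Wait()
	afterPointer = arr

	// Passed as a slice: the slice refers to the caller's array too.
	wg.Add(1)
	go func(s []int) { defer wg.Done(); s[1] = 77 }(arr[:])
	wg.Wait()
	afterSlice = arr

	return afterValue, afterPointer, afterSlice
}

func main() {
	v, p, s := arrayDemo()
	fmt.Println(v, p, s) // prints "[1 2 3] [99 2 3] [99 77 3]"
}
```

Waiting on the WaitGroup before reading `arr` is what makes each change reliably observable in the calling goroutine.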

In languages that enforce safety through immutability, this situation is prevented by making the reference immutable as well. Go doesn't have this.

So, yes, if you're worried about making copies, then use a bytes.Buffer or a []byte, rather than a string. But, the reason isn't to do with thread safety. It has to do with optimizations, such as putting strings in immutable memory and reusing them throughout the program, and being able to pass slices of a string without worrying about the original string changing. If you don't want this, then you have to use a mutable data type.


On Sun, Nov 13, 2011 at 9:13 PM, philip <phili...@gmail.com> wrote:
> Is this really true? How do you know this, do you have example code?
> do you get this from the documentation?

It's an implementation detail. It can work this way because the spec doesn't specify otherwise, and it would probably be inefficient to implement it any other way. Here's a blog post about Go data structures by one of the language designers:

Ian Lance Taylor

Nov 14, 2011, 1:28:52 AM
to philip, golang-nuts
philip <phili...@gmail.com> writes:

What do you mean by "a normal object?"

As I said in my previous reply, in Go, constants are immutable, and
variables are not. So if by "a normal object" you mean a variable, then
it is mutable. If you change a global variable in one goroutine, the
change will be seen by another goroutine.

The idea seems very simple to me, so I think there must be some
communication difficulty which is making it seem complex to you.


> I find the terms immutable and mutable difficult to understand for
> this go-lang, my idea of mutable is what you have in C, you have a
> pointer to some data ie, int *ref; if you want to update the ref you
> can do (*ref) = 5; (if my memory serves me). That's a mutable pointer,
> which I consider the same as a mutable reference.

As I said in my previous reply, Go is basically the same as C and most
other languages in this regard.

You say you find the terms difficult to understand for Go. The Go spec
uses the words "mutable" and "immutable" in exactly one place:

A _string type_ represents the set of string values. Strings behave
like arrays of bytes but are immutable: once created, it is
impossible to change the contents of a string.

This means that once you create a string value, you can not change that
value. You can of course change the value of a string _variable_.
However, once a string _value_ exists, it will never change. The same
is true of an integer value, of course. Once you create the integer
value 5, it will always be 5. You can of course change the value of an
integer variable, but you can't change a value.

This notion of an immutable string value is something which does not
exist in C. In C a string literal is immutable, but C does not have a
string type and therefore does not have string values. In C a string
literal does not have string type, it has the type "const char [N]" for
some N.


> Then secondly, how can I do mutable update to strings in a loop? It
> seems in the cases you have given, its going to produce a lot of
> rubbish to be garbage collected. It was a killer for a project in a
> company I was in (I didn't work on it). They had many loops which
> updated Java Strings without using StringBuffer, so the consequence
> was too much rubbish for the garbage collector, this was early 2000's.
> That was a serious problem. You might say that now, the garbage
> collectors are so good it doesn't matter, but it does matter for a
> computer game if your doing lots of data processing from frame to
> frame. For a business app maybe not.

Yes, pretty much the same problem can occur in Go. Programs normally
avoid this by using []byte rather than string.
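
One way to see the difference on a given machine is to count heap allocations. The sketch below uses testing.AllocsPerRun from the standard library; the helpers `concat`, `buffered` and `allocs` are made-up names, and the exact numbers depend on the compiler and runtime, so none are shown here.

```go
package main

import (
	"bytes"
	"fmt"
	"testing"
)

// sink keeps the results alive so the work is not optimised away.
var sink string

// concat builds the string by repeated concatenation.
func concat(n int) string {
	s := ""
	for i := 0; i < n; i++ {
		s += "foo" // each += may allocate a new, longer string
	}
	return s
}

// buffered builds the same string through a bytes.Buffer.
func buffered(n int) string {
	var b bytes.Buffer
	for i := 0; i < n; i++ {
		b.WriteString("foo")
	}
	return b.String()
}

// allocs reports the average number of heap allocations per call of f.
func allocs(f func()) float64 {
	return testing.AllocsPerRun(10, f)
}

func main() {
	c := allocs(func() { sink = concat(1000) })
	b := allocs(func() { sink = buffered(1000) })
	fmt.Printf("concatenation: %.0f allocations, bytes.Buffer: %.0f allocations\n", c, b)
}
```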

Ian

Dmitry Vyukov

Nov 14, 2011, 1:31:33 AM
to Steven Blenkinsop, philip, golang-nuts
On Mon, Nov 14, 2011 at 8:38 AM, Steven Blenkinsop <stev...@gmail.com> wrote:
> On Sun, Nov 13, 2011 at 8:27 PM, philip <phili...@gmail.com> wrote:
>>
>> I am not sure that I know what Mutability or Immutability is
>> anymore :) I don't see a definition of it here:
>> http://golang.org/doc/go_spec.html
>
>
> It might be clearer if the spec just said that indexing expressions on a
> string aren't addressable. The implications of this are fully defined in the
> spec. This is what it means when it says strings are immutable.

+1
Moreover, even if string buffers are generally immutable in an
implementation, there is no reason not to mutate one when the compiler
is sure there is only one reference to the buffer.

chris dollin

Nov 14, 2011, 1:43:10 AM
to Ian Lance Taylor, philip, golang-nuts
On 14 November 2011 06:28, Ian Lance Taylor <ia...@google.com> wrote:
>  If you change a global variable in one goroutine, the
> change will be seen by another goroutine.

I thought that for the change to be /guaranteed/ to be seen there had
to be some sort of synchronisation between the goroutines?

Chris

--
Chris "allusive" Dollin

Ian Lance Taylor

Nov 14, 2011, 1:56:45 AM
to Dmitry Vyukov, Steven Blenkinsop, philip, golang-nuts
Dmitry Vyukov <dvy...@google.com> writes:

To be clear, the spec does already say that indexing expressions in a
string are not addressable, or, rather, it lists all addressable
expressions, and a string index is not one of them.

I agree that the compiler can optimize string manipulations in some
cases. This is another area where escape analysis can help.

Ian

Ian Lance Taylor

Nov 14, 2011, 1:57:55 AM
to chris dollin, philip, golang-nuts
chris dollin <ehog....@googlemail.com> writes:

> On 14 November 2011 06:28, Ian Lance Taylor <ia...@google.com> wrote:
>>  If you change a global variable in one goroutine, the
>> change will be seen by another goroutine.
>
> I thought that for the change to be /guaranteed/ to be seen there had
> to be some sort of synchronisation between the goroutines?

Yes. I was speaking loosely. I should probably have said "may" rather
than "will."
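
A minimal sketch of the synchronisation being discussed, using a channel; the names `shared` and `writeThenRead` are hypothetical. The write to the global variable is guaranteed to be visible to the reader because the reader first waits for the channel to be closed.

```go
package main

import "fmt"

// shared is a global variable written by one goroutine and read by another.
var shared int

// writeThenRead writes shared in a new goroutine and reads it back
// only after a synchronisation point.
func writeThenRead() int {
	done := make(chan struct{})
	go func() {
		shared = 42
		close(done) // signals that the write has happened
	}()
	<-done // after this receive, the write to shared is guaranteed visible
	return shared
}

func main() {
	fmt.Println(writeThenRead()) // prints "42"
}
```

Without the channel (or a mutex, or another synchronisation primitive), the reading goroutine would have no guarantee about when, or whether, it observes the new value.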

Ian

Dmitry Vyukov

Nov 14, 2011, 2:08:40 AM
to Ian Lance Taylor, Steven Blenkinsop, philip, golang-nuts

"String immutability" seems to be redundant in the spec and cause of
a lot of confusion. Moreover, it does not affect observable behavior in
any way, so... I am not even sure what it means in the context of a
language specification. I think that an implementation that mutates
string buffers every now and then is still conforming... Moreover, a
guarantee of string buffer/value immutability makes little sense for a
user w/o a guarantee of string value/buffer sharing, which is not
provided. It seems that the aspect is better covered by Effective Go.

Nov 14, 2011, 5:34:27 AM
to golang-nuts
On Nov 14, 2:29 am, philip <philip14...@gmail.com> wrote:
> Yes correct, I should not have said "That's a mutable pointer". I
> mean't "That's a mutable int, referenced by a pointer".

I am starting to think that we are playing with words too much.

Maybe it would be better if you posted a snippet of Go code.

philip

Nov 14, 2011, 8:53:15 AM
to golang-nuts
I had to make a correction here because he said that I was talking
about something else, not what I meant exactly. Sometimes it's
necessary to be more specific.

philip

Nov 14, 2011, 6:03:24 AM
to golang-nuts
I believe it's not a good idea to use a byte slice or bytes.Buffer for
the string because of encoding issues. In particular, the UTF-8 or
UTF-16 encoding is different from a "C" style string, which is just a
string of bytes. UTF-8 keeps the length of the string at the beginning
of the string.

philip

Nov 14, 2011, 6:00:34 AM
to golang-nuts

Hi All,

I think the answer is that there are no immutable types in Go, and there
is no copy on write (COW). Whereas Scala's types are usually
immutable, and Scala prefers immutability as a solution to multi-threading
(also through actors) and does COW when they are written to, Google Go
prefers a different concurrency paradigm. Google Go likes
mutexes and communication through channels, as described under concurrency:
http://golang.org/doc/effective_go.html#concurrency

Regarding Google Go's String class, the string implementation prevents
updates to its internal string representation, and this is a
property of the go-lang String's encapsulation and
implementation. From a string user's point of view this appears to be a
normal string which is being updated. Perhaps the Go-lang string isn't
such a good idea under some circumstances, such as looping to create
many strings, as this might create a lot of strings for the garbage
collector. It could be useful to have a mutable string class for Go,
like Java has a StringBuffer for its String class.
So, Google-go types and any struct/class you create by default are all
mutable, but go encourages concurrency patterns which are not based on
COW, so mutability and immutability are not really a concern.
I come from a C, then C++, then Java, then Scala background, so my
ideas of mutability and immutability come from there. In particular
more recently Scala.

Please correct me if I am wrong.
Regards, Philip



Rob 'Commander' Pike

Nov 14, 2011, 11:05:11 AM
to philip, golang-nuts
On Mon, Nov 14, 2011 at 3:03 AM, philip <phili...@gmail.com> wrote:
> I believe its not a good idea to use a byte slice or bytes.Buffer for
> the string because of encoding issues, in particular the UTF-8 or
> UTF-16 encoding is different from a "C" style string which is just a
> string of bytes. UTF-8 keeps the length of the string at the beginning
> of the string.

Every detail of this is wrong.

-rob

Jan Mercl

Nov 14, 2011, 11:33:55 AM
to golan...@googlegroups.com
On Monday, November 14, 2011 12:00:34 PM UTC+1, philip wrote:
> I think the answer is that there are no immutable types in Go,

Yes. But there are immutable values in Go. A string value is immutable.
 
> there
> is no copy on write (COW). Where-as Scala's types are usually
> immutable and prefer immutability as a solution to multi-threading
> (also through actors) and when written to, they do COW, Google Go
> prefers a different concurrency paradigm to Scala. Google Go likes
> mutex and communication through channels as described in concurrency.
> http://golang.org/doc/effective_go.html#concurrency

> Regarding Google Go's String class,

There's no such thing as a Go String class (nor any class).
 
> the string implementation prevents
> updates to its internal string representation and that this is a
> property of the go-lang String's encapsulation and
> implementation.

Not due to the implementation but because it is in the specification of the Go language. I think it's the same in Java. The catch is - Java is all reference types (except primitive types like int etc.), but in Go string is a value type (with an invisible internal reference to a string value).
 
> From a string users point of view this appears to be a
> normal string which is being updated.

This obviously misses the difference between a value and a variable. If the string (value) were updated, then any previously existing string variables already referring to it would change their value, but this is not the case in Go.
 
> Perhaps the Go-lang string isn't
> such a good idea under some circumstances, such as looping to create
> many strings as this might create a lot of strings for the garbage
> collector.

Just don't concatenate strings in a loop. Use byte slices with append or bytes.Buffer (which does the same internally).
 
> It could be useful to have a mutable string class for Go,
> like Java has a StringBuffer for its String class.

It is useful and in Go it is []byte and/or bytes.Buffer.
 
> So, Google-go types and any struct/class you create by default are all
> mutable,

Not types. Variables. Variables do variate. That's what they are for.
 
> but go encourages concurrency patterns which are not based on
> COW, so mutability and immutability are not really a concern.
> I come from a C, then C++, then Java, then Scala background, so my
> ideas of mutability and immutability come from there. In particular
> more recently Scala.
>
> Please correct me if I am wrong.

More playing with Go in practice will surely be helpful. It may help to try not to think in terms/concepts of other languages when studying Go - it's a common source of confusion. Go beautifully differs in some details from other languages.

philip

Nov 14, 2011, 11:02:00 PM
to golang-nuts
Reading the go-lang spec "; all other escapes represent the (possibly
multi-byte) UTF-8 encoding of individual characters. ". there is a
recognition that strings can be composed of multi-byte chars.
Also I now read in the go-lang docs, http://golang.org/pkg/bytes/ that
there is support for UTF-8.

So my statement "I believe its not a good idea to use a byte slice or
bytes.Buffer" looks to be wrong, but the other details are correct. I
would be worried about code which creates a byte array of a particular
length if it doesn't take into account the variable byte size
of a char in a UTF-8 encoded string (2-4 bytes). Chinese characters
don't fit easily into an 8-bit byte.


On Nov 15, 12:05 am, "Rob 'Commander' Pike" <r...@golang.org> wrote:

philip

Nov 14, 2011, 10:23:04 PM
to golang-nuts
Hi Rob,

Well, I'll just quote other webpages and you can work out if it's
right or wrong.

From http://en.wikipedia.org/wiki/UTF-8
"For every UTF-8 byte sequence corresponding to a single Unicode
character, the first byte unambiguously indicates the length of the
sequence in bytes"

"UTF-8 is a variable-width encoding, with each character represented
by one to four bytes."

C style strings are null terminated
http://en.wikipedia.org/wiki/C_string_handling

Best Regards, Philip

On Nov 15, 12:05 am, "Rob 'Commander' Pike" <r...@golang.org> wrote:

philip

Nov 14, 2011, 10:28:44 PM
to golang-nuts
UTF-8 won't encode into a byte buffer as well as you suggest, because
a single character in UTF-8 is typically 2 bytes; it can be between
one and four: "UTF-8 is a variable-width encoding, with each character
represented by one to four bytes." (Wikipedia). If I am using Chinese
characters in Strings, then a single byte is not a good way to go. For
the other issues I will read through and come back with another email.

Rob 'Commander' Pike

Nov 15, 2011, 12:57:48 AM
to philip, golang-nuts

On Nov 14, 2011, at 7:23 PM, philip wrote:

> Hi Rob,
>
> Well, I'll just quote at other webpages and you can work out if its
> right or wrong.
>
> From http://en.wikipedia.org/wiki/UTF-8
> "For every UTF-8 byte sequence corresponding to a single Unicode
> character, the first byte unambiguously indicates the length of the
> sequence in bytes"
>
> "UTF-8 is a variable-width encoding, with each character represented
> by one to four bytes."
>
> C style strings are null terminated
> http://en.wikipedia.org/wiki/C_string_handling
>
> Best Regards, Philip
>
> On Nov 15, 12:05 am, "Rob 'Commander' Pike" <r...@golang.org> wrote:
>> On Mon, Nov 14, 2011 at 3:03 AM, philip <philip14...@gmail.com> wrote:
>>> I believe its not a good idea to use a byte slice or bytes.Buffer for
>>> the string because of encoding issues,

It's a fine idea, the best idea.

>>> in particular the UTF-8 or
>>> UTF-16 encoding is different from a "C" style string which is just a
>>> string of bytes.

A Go string is also just a string of bytes. A Go string *constant* is encoded as UTF-8 (provided there are no \x escapes) but there is no requirement that a Go string have any particular encoding or represent any particular characters. It's just bytes, and in this way (although not some others, such as NUL termination) is equivalent to a C string.
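
A minimal sketch of the point above, assuming nothing beyond the standard unicode/utf8 package: a string written as ordinary source text holds UTF-8, while a string built from \x escapes can hold bytes that are not valid UTF-8 at all. The helper name isUTF8 is made up for this illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isUTF8 reports whether s holds a valid UTF-8 byte sequence.
func isUTF8(s string) bool {
	return utf8.ValidString(s)
}

func main() {
	text := "héllo"   // typed as ordinary source text, so stored as UTF-8
	raw := "\xff\xfe" // built from \x escapes: two arbitrary bytes

	// len counts bytes, not characters: "é" takes two bytes.
	fmt.Println(len(text), isUTF8(text)) // 6 true
	fmt.Println(len(raw), isUTF8(raw))   // 2 false
}
```

Both values are perfectly legal Go strings; only the first happens to be valid UTF-8.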

>>> UTF-8 keeps the length of the string at the beginning
>>> of the string.

It does no such thing. I believe you misunderstand this sentence:

> "For every UTF-8 byte sequence corresponding to a single Unicode
> character, the first byte unambiguously indicates the length of the
> sequence in bytes"

That does not say the length of the string is at the beginning of the string. What it says is that for each encoded character, the length of the encoding can be determined by studying the first byte of that character's encoding.
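
As a rough illustration of that sentence, the sketch below walks a string one encoded character at a time and records how many bytes each one occupies. It uses utf8.DecodeRuneInString from the standard library; the function name encodedLengths is hypothetical.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedLengths returns the number of bytes used by each character of s.
func encodedLengths(s string) []int {
	var sizes []int
	for len(s) > 0 {
		// size is the width in bytes of the first character's encoding.
		_, size := utf8.DecodeRuneInString(s)
		sizes = append(sizes, size)
		s = s[size:] // step over exactly one encoded character
	}
	return sizes
}

func main() {
	fmt.Println(encodedLengths("aé世")) // [1 2 3]
}
```

The length of the whole string is stored nowhere in the UTF-8 data itself; each character only announces its own width.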

>>
>> Every detail of this is wrong.

I stand by this statement.

-rob


Ian Lance Taylor

Nov 15, 2011, 1:50:48 AM11/15/11
to philip, golang-nuts
philip <phili...@gmail.com> writes:

> Utf-8 won't encode into a byte buffer as well as you suggest because
> the single character in utf-8 is typically 2 bytes, it can be between
> "UTF-8 is a variable-width encoding, with each character represented
> by one to four bytes.", (wikipedia). If I am using Chinese characters
> in Strings, then a single byte is not a good way to go. For the other
> issues I will read through and come back with another email.

UTF-8 encodes into a byte buffer just fine. In fact, it does not
readily encode into anything else. Yes, a Unicode character turns into
1 to 4 bytes in UTF-8. That just means that there is no one-to-one
correspondence between bytes and characters. It does not mean that you
can't use a byte buffer to represent UTF-8.

As evidence of this, many functions in http://golang.org/pkg/bytes deal
with UTF-8 in a byte buffer.
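
A small sketch of what this looks like in practice, assuming only the standard bytes and unicode/utf8 packages; the sample text and the helper name countChars are invented for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// countChars returns the number of Unicode characters in a UTF-8 byte slice.
func countChars(b []byte) int {
	return utf8.RuneCount(b)
}

func main() {
	b := []byte("你好, Go")

	fmt.Println(len(b), countChars(b))    // 10 6: ten bytes, six characters
	fmt.Println(bytes.IndexRune(b, 'G'))  // 8: byte offset of 'G'
	fmt.Println(string(bytes.ToUpper(b))) // 你好, GO
}
```

The slice elements are plain 8-bit bytes throughout; the functions group them into characters when asked to.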

Ian

philip

Nov 15, 2011, 1:01:28 AM11/15/11
to golang-nuts
Hi there,
There is a lot of terminology mixed up here. I used the word class;
you don't have classes in go-lang, OK, that's fine, but you know what I
mean. I said that go-lang types are mutable, and you said that types
don't mutate. Well, I know that, but do you know what I mean without me
having to be very specifically correct in the terminology? Instances
(variable values) of types can be mutable, and I originally wanted to
know which types allow their instances to be mutable. In particular,
the string type doesn't allow instances of the string to be mutable.

A string value is immutable specifically because it is of the type
string, because of how it is defined in the spec and how it is
implemented in the Go language, so my question back at StackOverflow
was "Which types are mutable and immutable in the Google Go Language?".
Which, of course, is a wrong question because types are not mutable,
but everyone understands that. If I said string class, you know I am
referring to the string type. This is really just terminology: since
you don't have a class, you have types, you already know what I mean.

So, you are correct about the terminology, but I think a lot of what
you said is correcting my terminology within the go-language context,
and that we are talking about the same concepts.

I am totally worried about "Go it is []byte and/or bytes.Buffer." in
regards to foreign languages, such as Chinese. If we are talking about
a byte representing a character and a byte meaning 8 bits, then that's
not a good idea for Chinese and many other languages. (I live in Hong
Kong.) If you do this (use 8-bit byte slices; is it slices here, or
arrays?) and then decide to make your application multilingual,
then you're in trouble.
Regards, Philip

On Nov 15, 12:33 am, Jan Mercl <jan.me...@nic.cz> wrote:

Ian Lance Taylor

Nov 15, 2011, 2:02:17 AM11/15/11
to philip, golang-nuts
philip <phili...@gmail.com> writes:

> I am totally worried about "Go it is []byte and/or bytes.Buffer." in
> regards to foreign languages, such as Chinese. If we are talking about
> a byte representing a character and a byte meaning 8 bits, then that's
> not a good idea for Chinese and many other languages. (I live in Hong
> Kong). If you do this (use 8 bit byte slices, is it slices here,
> arrays? ), then you decide to make your application multi-lingual,
> then your in trouble.

Just to be completely crystal clear, nobody is talking about "a byte
representing a character."

Go even recently introduced a name for the type which represents a
Unicode character: rune, which is a 32-bit signed integer. And Go does
in fact support using []rune to represent a string of Unicode
characters, if you feel uncomfortable using UTF-8 and []byte.
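
A brief sketch of working with []rune, assuming nothing beyond the language itself. Converting a string to []rune decodes the UTF-8 so that each slice element is one Unicode character; the reverse helper is a made-up example of an operation that is easier on runes than on bytes.

```go
package main

import "fmt"

// reverse returns s with its characters in reverse order.
func reverse(s string) string {
	r := []rune(s) // one element per Unicode character
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r) // re-encodes the runes as UTF-8
}

func main() {
	s := "香港 HK"
	fmt.Println(len(s), len([]rune(s))) // 9 5: nine bytes, five runes
	fmt.Println(reverse(s))             // KH 港香
}
```

Reversing the bytes instead of the runes would scramble the multi-byte characters, which is the kind of case where []rune is the more comfortable representation.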

Ian

chris dollin

Nov 15, 2011, 2:38:50 AM11/15/11
to philip, golang-nuts
On 15 November 2011 06:01, philip <phili...@gmail.com> wrote:
>If we are talking about
> a byte representing a character and a byte meaning 8 bits,

We're not.

Can you show us a specific problem you've actually got? I'm feeling
rather ungrounded.

(You have read the Go spec, yes?)

Volker Dobler

Nov 15, 2011, 3:53:44 AM11/15/11
to golang-nuts
> From http://en.wikipedia.org/wiki/UTF-8
> "For every UTF-8 byte sequence corresponding to a single Unicode
> character, the first byte unambiguously indicates the length of the
> sequence in bytes"

Hey, that's cool: No more texts with more than 256 characters.
How simple and fast will my life become...
You misinterpret "first byte": It's the first byte of a single
code point (rune in Go) which determines how many bytes this
rune is encoded into.

This whole thread gets more and more out of control. May I kindly
ask you a) not to impose inappropriate terminology and techniques
from other languages on Go, and b) to please believe at least some
of the things the experts tell you?

To a): Attributes like hair colour, date of birth, weight and
spoken languages are suitable attributes to describe humans but
it's plain stupid to try and describe electrons by hair colour
(even if they do have weight/mass). Some concepts just do not
carry over. It's the same with programming languages.

To the whole "immutability" stuff: Go is very simple here, its
string type behaves much like Java's or Python's string (they
are immutable :-) and should be used the same way (there you use
StringBuffer or "".join() to construct strings, in Go bytes.Buffer),
and all this has nothing to do with thread safety. And maybe Haskell
or Scala do it completely the other way around (syntax and
representation in memory), but that's fine and okay.

Regards, Volker

Volker Dobler

Nov 15, 2011, 4:14:01 AM11/15/11
to golang-nuts


On Nov 15, 7:01 am, philip <philip14...@gmail.com> wrote:
> [...] so my question was back at stackoverflow "Which types
> are mutable and immutable in the Google Go Language?".

Answer: Strings are immutable.

Additional answer: This information is completely irrelevant
to writing nice, localized, i18ned programs which work for all
languages. It's even irrelevant to normal everyday programming.
Your compiler will tell you if you do something which cannot
be done (because something is immutable).

One more answer: Immutability has _nothing_ to do with
concurrency, thread safety, atomicity and so on. If you need
such stuff you may either have clear ownership of any object
(immutable or not), e.g. by transferring ownership via channels,
or protect reads/writes via a Mutex.
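
A minimal sketch of the Mutex option, assuming a hypothetical shared counter type; none of these names come from the thread. Every read and write of the shared field goes through the lock, so it does not matter that the value itself is mutable.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical shared value guarded by a mutex.
type counter struct {
	mu sync.Mutex
	n  int
}

// add changes the counter while holding the lock.
func (c *counter) add(delta int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n += delta
}

// value reads the counter while holding the lock.
func (c *counter) value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// runWorkers starts the given number of goroutines, each adding 1,
// waits for all of them, and returns the final count.
func runWorkers(workers int) int {
	var c counter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.add(1)
		}()
	}
	wg.Wait()
	return c.value()
}

func main() {
	fmt.Println(runWorkers(100)) // 100
}
```

Without the lock in add, the increments could race and the final count would not be reliable.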

> I am totally worried about "Go it is []byte and/or bytes.Buffer." in
> regards to foreign languages, such as Chinese. If we are talking about
> a byte representing a character and a byte meaning 8 bits, then that's
> not a good idea for Chinese and many other languages. (I live in Hong
> Kong). If you do this (use 8 bit byte slices, is it slices here,
> arrays? ), then you decide to make your application multi-lingual,
> then your in trouble.

No need to worry. Everything is fine. All text is stored properly.
You may use UTF-8, UTF-16, and others. String literals in Go source
are UTF-8. There's a special "character type" (rune) which works for
all languages. But sometimes you need to write this proper
multi-language, any-character-set text to disk or transmit it via a
network, and here come byte and package bytes.

Take a look at the Go source: Something like
	s := "abc" + someString + "; " + otherString + "."
is not executed as
	s1 := "abc" + someString
	s2 := s1 + "; "
	s3 := s2 + otherString
	s := s3 + "."
but in a single memory allocation.
Just use simple string concatenation, and if your program is too
slow, fails, or does more garbage collection than anything else,
post your code here and we will show you how to benefit from
bytes.Buffer _without_ affecting any localization, i18n, encoding
or character sets. Promised.
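
For readers who want to see it before hitting a performance problem, here is a small sketch of building a string in a loop with bytes.Buffer and checking that the result matches plain concatenation, Chinese text included. The function name joinAll and the sample words are assumptions made for this example.

```go
package main

import (
	"bytes"
	"fmt"
)

// joinAll concatenates parts using a bytes.Buffer instead of repeated +=.
func joinAll(parts []string) string {
	var buf bytes.Buffer
	for _, p := range parts {
		buf.WriteString(p) // appends the bytes of p unchanged
	}
	return buf.String()
}

func main() {
	parts := []string{"你好", ", ", "世界", "."}

	plain := ""
	for _, p := range parts {
		plain += p // each += may allocate a new string
	}

	fmt.Println(joinAll(parts) == plain) // true
	fmt.Println(joinAll(parts))          // 你好, 世界.
}
```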

Regards, Volker

Nov 15, 2011, 4:52:33 AM11/15/11
to golang-nuts
On Nov 15, 7:01 am, philip <philip14...@gmail.com> wrote:
> I am totally worried about "Go it is []byte and/or bytes.Buffer." in
> regards to foreign languages, such as Chinese. If we are talking about
> a byte representing a character and a byte meaning 8 bits, then that's
> not a good idea for Chinese and many other languages. (I live in Hong
> Kong). If you do this (use 8 bit byte slices, is it slices here,
> arrays? ), then you decide to make your application multi-lingual,
> then your in trouble.

Question: Are you assuming that to copy a UTF-8 character from a
source to a destination you would always write Go code like:

dest[i] = src[j]

Please answer the above question.
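
For contrast, a hedged sketch of copying one whole encoded character rather than one byte: decode the first character to learn its width, then copy that many bytes. The helper copyFirstChar is hypothetical, and it assumes dest is large enough (otherwise copy simply truncates).

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// copyFirstChar copies the first UTF-8 encoded character of src into dest
// and returns the number of bytes copied.
func copyFirstChar(dest, src []byte) int {
	_, size := utf8.DecodeRune(src) // width in bytes of the first character
	return copy(dest, src[:size])
}

func main() {
	src := []byte("世界")
	dest := make([]byte, utf8.UTFMax) // room for any single character

	n := copyFirstChar(dest, src)
	fmt.Println(n, string(dest[:n])) // 3 世
}
```

A single assignment such as dest[i] = src[j] moves exactly one byte, which is a whole character only for ASCII.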

Jan Mercl

Nov 15, 2011, 5:47:51 AM11/15/11
to golan...@googlegroups.com
On Tuesday, November 15, 2011 7:01:28 AM UTC+1, philip wrote:
> Hi there,
> There is a lot of terminology mixed up here, I used the word class -
> you don't have classes in go-lang, ok thats fine but you know what I
> mean. I said that go-lang types are mutable, you said that types don't
> mutate, well I know that, but do you know what I mean without me
> having to be very specifically correct in the terminology. Instances
> (variable values) of types can be mutable and I wanted to know
> originally which types allow instances of those to be mutable. In
> particular the string type doesn't allow instances of the string to be
> mutable.
> A string value is immutable specifically because it is of the type
> string and how it is defined in the spec and how it is implemented in
> the go language, so my question was back at stackoverflow "Which types
> are mutable and immutable in the Google Go Language?". Which of
> course, is a wrong question because types are not mutable, but
> everyone understands that. If I said string class, you know I am
> referring to string type. This is really just terminology, since you
> don't have a class - you have types - then you already know what I
> mean.
> So, you are correct - about the terminology, but I think a lot of what
> you said is correcting my terminology within the go-language context
> and that we are talking about the same concepts.

Admittedly I may sound like I'm nitpicking on terminology, but it was only an attempt to aid understanding.

Go "immutable" strings mysteries, simplified as much as I can:

a) There are no immutable variables in Go.
b) Every variable has a type so there are also no immutable types in Go.
c) What is immutable in Go is a ([]byte) value which is held, via an internal reference, by a string variable. The variable per se is still mutable (a).
d) Mutating the string variable never mutates the []byte value referenced by that variable.
e) Because of d), string variables can share slices of the []byte value where possible; e.g. after "t, u := s[i:j], s[:3]", 's', 't' and 'u' will all together have only one backing []byte value.
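
Points a) to e) can be sketched in a few lines. This is only an illustration under the assumption that a "changed" string is produced by copying into a []byte; the helper replaceFirst is invented here, and whether substrings really share storage is an implementation detail that the program cannot observe.

```go
package main

import "fmt"

// replaceFirst returns a copy of s whose first byte is replaced by c.
// The bytes of s itself are never modified.
func replaceFirst(s string, c byte) string {
	if s == "" {
		return s
	}
	b := []byte(s) // the conversion copies the bytes
	b[0] = c
	return string(b)
}

func main() {
	s := "hello"
	old := s    // second variable referring to the same bytes
	t := s[1:3] // substring; may share the backing bytes with s

	// s[0] = 'H' would not compile: the bytes of a string cannot be assigned.
	u := replaceFirst(s, 'H')

	s = "bye" // the variable is mutable; the old bytes are untouched

	fmt.Println(s, old, t, u) // bye hello el Hello
}
```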

philip

Nov 15, 2011, 3:58:51 AM11/15/11
to golang-nuts
Hi,

A lot of people said to use a byte buffer. I assume a byte means 8 bits.

Just go back through the thread; I haven't listed them all.

1. Instead of using a string builder, you can either use (1) a byte
slice and append, or (2) a bytes.Buffer. T

2. Use a bytes.Buffer to avoid that problem if you are going to
generate large strings.

3. Like tux21b wrote: you can either use a byte slice and append, or a
bytes.Buffer.

4. I think you are using a different vocabulary than Philip. For
example, he asked "how can I do mutable update to strings in a loop?".
This most likely means to mutate the content of a string. Since this is
impossible in Go, the code has to use []byte.

Philip


On Nov 15, 3:02 pm, Ian Lance Taylor <i...@google.com> wrote:

philip

Nov 15, 2011, 4:09:22 AM11/15/11
to golang-nuts

The problem is:

1. In the go spec it says the size of a byte is 8 bits, at the bottom
of the spec.
http://golang.org/doc/go_spec.html

type                                 size in bytes
byte, uint8, int8                     1

2. People in this thread said to use a byte buffer when doing string
appending (in response to my complaint about potential garbage
collection problems in a loop).

3. UTF-8 string chars can be between 2 to 4 bytes in length.



On Nov 15, 3:38 pm, chris dollin <ehog.he...@googlemail.com> wrote:

philip

Nov 15, 2011, 4:23:42 AM11/15/11
to golang-nuts

Immutability and mutability do have a relation to thread safety. If
you come from a C/C++ background, then a mutable variable accessed by
two threads and updated by one without locking mutexes is a problem;
using locking is OK but has other problems: deadlock and efficiency.
An immutable variable in one thread in Scala does not lead to any
problem with the other thread, since the update does not change the
other thread's referenced object because of COW (as discussed). These
terms, mutable and immutable, are important in Scala and are important
for concurrency in Scala, absolutely. So it's not correct to say that
mutability and immutability are not related to concurrency. However, in
the Go language you don't do it in the same way as Scala; you do it
using channels, mutexes and dataflow-like programming.

roger peppe

Nov 15, 2011, 7:36:19 AM11/15/11
to philip, golang-nuts
On 15 November 2011 09:09, philip <phili...@gmail.com> wrote:
>
> The problem is:
>
> 1. In the go spec it says the size of a byte is 8 bits, at the bottom
> of the spec.
> http://golang.org/doc/go_spec.html
>
> type                                 size in bytes
> byte, uint8, int8                     1
>
> 2. People in this thread said to use a byte buffer when doing string
> appending (in response to my complaint about potential garbage
> collection problems in a loop).
>
> 3. UTF-8 string chars can be between 2 to 4 bytes in length.

that's not a problem. each time you append a character,
you can append more than one byte.

for example, here's some code that takes an array
of strings and concatenates them, separated by smiling faces:

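	import "bytes"

	// smileys concatenates the strings in a, writing a smiley after each one.
	func smileys(a []string) string {
		var b bytes.Buffer
		for _, s := range a {
			b.WriteString(s)
			b.WriteRune('☺') // one rune, written as several UTF-8 bytes
		}
		return b.String()
	}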

here's some code that does the same thing but using
a byte slice (requires weekly Go version for byte-slice string appending):

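	// smileys builds the same result by appending to a byte slice.
	func smileys(a []string) string {
		var b []byte
		sep := []byte("☺")
		for _, s := range a {
			b = append(b, s...)
			b = append(b, sep...) // use sep; an unused variable does not compile
		}
		return string(b)
	}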

Volker Dobler

Nov 15, 2011, 7:53:45 AM11/15/11
to golang-nuts


On Nov 15, 10:23 am, philip <philip14...@gmail.com> wrote:
> Immutability and mutability does have a relation to thread safety, if
> you come from a C/C++ background then a mutable variable accessed by
> two threads, updated by one without locking mutexes is a problem,
> using locking is ok but has other problems, deadlock and efficiency.
> Immutable variable in one thread in Scala does not lead to any problem
> with the other thread, since the update does not change the other
> threads referenced object because of COW (as discussed). These terms
> Mutable and Immutable are important in Scala and are important for
> concurrency in Scala, absolutely. So its not correct to say that
> mutability, immutability is not related to concurrency. However, in
> the Go Language you don't do it in the same way as Scala, you do it
> using channels, mutex and dataflow like programming.

Go is not any of C, C++ or Scala.
Go has no copy on write.
Just forget about immutability; it is a concept which is
plain _inappropriate_ for Go. Talking about immutability
will not take you any further and will not help you understand
what the solution in Go is to all your problems.
Your background in C, C++ and Scala is valuable, but just
don't carry over stuff to Go which does not suit.

Regards, Volker

Ian Lance Taylor

Nov 15, 2011, 12:01:33 PM11/15/11
to philip, golang-nuts
philip <phili...@gmail.com> writes:

> A lot of people said use byte buffer. I assume a byte means 8 bits.
>
> Just go back through the thread - I havn't listed them all.
>
> 1.Instead of using a string builder, you can either use (1) a byte
> slice and
> append, or (2) a bytes.Buffer. T
>
> 2. Use a bytes.Buffer to avoid that problem if you are going to
> generate large strings.
>
> 3. Like tux21b wrote: you can either use a byte slice and append, or a
> bytes.Buffer.
>
> 4. think you are using a different vocabulary than Philip. For
> example,
> he asked "how can I do mutable update to strings in a loop?". This
> most likely means to mutate the content of a string. Since this is
> impossible in Go, the code has to use []byte.


1. Please consider that we know what we are talking about.

2. Please reread earlier messages until they make sense.

3. If you still have questions, please ask specific questions, ideally
with code examples, rather than general questions using unclear
vocabulary.

Thanks.

Ian

David Leimbach

Nov 15, 2011, 12:37:02 PM11/15/11
to golang-nuts
Immutable data means you can't update the data a variable is a handle
to once it's bound. Talking about changing a variable in one thread,
and when another thread sees the update, has nothing at all to do with
immutability; it has to do with a multi-threaded memory model. Maybe
you should rephrase your question with the right terminology so people
aren't confused.


On Nov 13, 5:01 am, philip <philip14...@gmail.com> wrote:
> Hi Ian,
>
> Regarding the question "I'm not I understand your question entirely.
> E.g., you say that
> "knowing that something is immutable means that I can wr..."
>
> I mean that if i am thinking of two threads (but you use go-routines
> in go) then if one thread has a reference to a object and another
> thread has a reference to the object, then if thread A updates the
> object, thread A will see its own update, but thread B will see the
> original object, in the immutability case. I consider this case 1.
> In the mutable case- or how I see mutability, I consider case 2, if
> they both have a reference to the object and thread A updates it, then
> thread B will see the update some time after A updated it, and this
> may not be thread safe. However, maybe this is something I want to be
> able to do. If you are thinking from a C language point of view, if
> they both have a pointer to some memory and one updates it, then they
> both see the update at some point.



>
> This is the consequence of immutability which I am concerned about. So
> my question is basically, are we in case 1 or case 2 for normal
> objects? As this really does matter for how you treat them IF they are
> shared object references across threads. Your probability going to say
> that's a bad idea to share references across threads and that the
> resources should be encapsulated by message passing actors, well you
> don't have actors, you have something similar in go-lang, at least in
> Scala thats how they would answer and say you need to use Actors if
> you want mutable state.
>
> I find the terms immutable and mutable difficult to understand for
> this go-lang, my idea of mutable is what you have in C, you have a
> pointer to some data ie, int *ref; if you want to update the ref you
> can do (*ref) = 5; (if my memory serves me). That's a mutable pointer,
> which I consider the same as a mutable reference.
>
> Then secondly, how can I do mutable update to strings in a loop? It
> seems in the cases you have given, its going to produce a lot of
> rubbish to be garbage collected. It was a killer for a project in a
> company I was in (I didn't work on it). They had many loops which
> updated Java Strings without using StringBuffer, so the consequence
> was too much rubbish for the garbage collector, this was early 2000's.
> That was a serious problem. You might say that now, the garbage
> collectors are so good it doesn't matter, but it does matter for a
> computer game if your doing lots of data processing from frame to
> frame. For a business app maybe not.
>
> Regards, Philip
>
> On Nov 13, 2:57 am, Ian Lance Taylor <i...@google.com> wrote:
>
>
>
>
>
>
>
> > philip <philip14...@gmail.com> writes:
> > > I asked this question on StackOverflow but the answers don't seem to
> > > answer my question yet.
> > >http://stackoverflow.com/questions/8018081/which-types-are-mutable-an...
>
> > > Are all my concerns valid or not valid?
>
> > I'm not I understand your question entirely.  E.g., you say that
> > "knowing that something is immutable means that I can write code which
> > is parallel and updates to one reference of the object...."  But when
> > something is immutable, it can not be updated.
>
> > I would say that in Go, constants are immutable.  Constants can be
> > explicitly declared via const declarations, or they may be constant
> > literals.  Constant literals include integers, floating point numbers,
> > complex numbers, and strings.  This is all the same as in most other
> > languages (early versions of C permitted modifying string literals, but
> > current ones do not).
>
> > So when you ask which types are immutable, I'm not sure what question
> > you are asking.  I would say that there is no such thing as an immutable
> > type in Go.  In Go, you can declare a variable of any type, and you can
> > change the value of that variable.  What is immutable in Go is
> > constants, not types.
>
> > You ask a specific question about creating a string in a loop, and
> > whether that creates something that has to be garbage collected.  The
> > answer to that question is, it depends.
>
> >         for i := 0; i < 10000; i++ {
> >                 fmt.Println("Hi")
> >         }
>
> > That loop will not create any extra strings.
>
> >         for i := 0; i < 10000; i++ {
> >                 a := "a"
> >                 a += s()
> >                 fmt.Println(a)
> >         }
>
> > That loop will (probably) create a new string which has to be garbage
> > collected each time through the loop, even if s happens to be
>
> > func s() string {
> >         return "b"
>
> > }
>
> > That is, even though the loop will always wind up printing the string
> > "ab", a new string constant will (probably) be created each time through
> > the loop, and the garbage collector will have to collect all those
> > constants.
>
> > Ian

David Leimbach

Nov 15, 2011, 12:38:48 PM11/15/11
to golang-nuts
You want to talk about the memory model of Go, which is well
documented.

http://golang.org/doc/go_mem.html
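
One guarantee from that document can be sketched briefly: a send on a channel happens before the corresponding receive completes, so data written before the send is visible to the receiver without any extra locking. The function name produce and the squared values are assumptions made for the example.

```go
package main

import "fmt"

// produce fills a slice in a separate goroutine and hands it over
// to the caller through a channel.
func produce(n int) []int {
	ch := make(chan []int)
	go func() {
		data := make([]int, n)
		for i := range data {
			data[i] = i * i
		}
		ch <- data // all writes above are visible to whoever receives
	}()
	return <-ch // after this, only the caller uses the slice
}

func main() {
	fmt.Println(produce(5)) // [0 1 4 9 16]
}
```

The slice is ordinary mutable data; it is safe here because ownership passes from one goroutine to the other at the channel operation.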

On Nov 13, 5:27 pm, philip <philip14...@gmail.com> wrote:
> I am not sure that I know what Mutability or Immutability is
> anymore :) I don't see a definition of it here:http://golang.org/doc/go_spec.html
>
> My idea of mutability is that when two threads make updates to the
> object, they both see the same thing.
> Immutability is when one thread makes update to the object, the other
> thread never sees the update. (COW). This gives thread safety.
> Const, or constant is just that, you are prevented from doing update
> to the object.
>
> Without a clear definition this is difficult, can someone give a clear
> definition of these terms in Go language?
>
> On Nov 14, 4:02 am, tux21b <tux...@gmail.com> wrote:
>
>
>
>
>
>
>
> > On Sunday, November 13, 2011 8:03:33 PM UTC+1, ⚛ wrote:
>
> > > I think you are using a different vocabulary than Philip. For example,
> > > he asked "how can I do mutable update to strings in a loop?". This
> > > most likely means to mutate the content of a string. Since this is
> > > impossible in Go, the code has to use []byte.
>
> > Yes, I agree there and I have suggested the same (a bytes.Buffer)
> > is basically the same. []byte would work too.
>
> > > In your response, you wrote that "updates to strings can not be
> > > atomic". This suggests you mean a string variable, not the content of
> > > a string.
>
> > Yes, that's about the string variable if you want to call it so.
>
> > But I think here is also the root of the misunderstanding.
>
> > 1) String values are immutable in Go, so adding something to a string will
> > create a new one.
>
> > 2) Scala also offers immutable data structures (in addition to string, they
> > also offer sets, trees and several other ones). Their "values" are immutable
> > too, and the advantage of their immutable implementation is basically that
> > you can use COW (copy-on-write) instead of expensive locking / unlocking
> > mechanisms for concurrent access.
>
> > In my opinion, philip has also asked for that (he talked about parallelism,
> > Scala, different threads and concurrent updates).
>
> > That's however a completely different question and the word  "immutable"
> > means different things in Scala and Go (and, as Ian said, it might also mean
> > constants in Go. *g*).
>
> > Immutability doesn't implies thread-safety.
>
> > In my post above, I have tried to summarize the differences between
> > Scala's immutable data-structures and Go's immutable string.
>
> > ... or I am missing something.
>
> > I am not sure. The thread (and also the stackoverflow questions) seems to
> > be full of misconceptions :D
>
> > -christoph

Steven Blenkinsop

Nov 15, 2011, 3:42:55 PM11/15/11
to philip, golang-nuts
On Nov 15, 2011, at 4:09 AM, philip <phili...@gmail.com> wrote:

>
> The problem is:
>
> 1. In the go spec it says the size of a byte is 8 bits, at the bottom
> of the spec.
> http://golang.org/doc/go_spec.html
>
> type size in bytes
> byte, uint8, int8 1
>
> 2. People in this thread said to use a byte buffer when doing string
> appending (in response to my complaint about potential garbage
> collection problems in a loop).
>
> 3. UTF-8 string chars can be between 2 to 4 bytes in length.

Yes. And these 2 to 4 8-bit bytes will be stored as 2 to 4 8-bit bytes in the bytes.Buffer, the same way they're stored as 2 to 4 8-bit bytes in a string, so UTF-8 doesn't care. It's not like there are variable sized elements in the string that you're trying to fit into fixed sized 8-bit containers. Both types have fixed size 8-bit elements and leave it up to the encoding to group them into runes.
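
A quick way to see this, sketched with only the standard bytes and fmt packages: print the bytes of a string and the bytes of a bytes.Buffer holding the same text. The two helper names are made up for the illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// hexOfString lists the bytes of s in hexadecimal.
func hexOfString(s string) string {
	return fmt.Sprintf("% x", s)
}

// hexOfBuffer writes s into a bytes.Buffer and lists the buffer's bytes.
func hexOfBuffer(s string) string {
	var buf bytes.Buffer
	buf.WriteString(s)
	return fmt.Sprintf("% x", buf.Bytes())
}

func main() {
	fmt.Println(hexOfString("世")) // e4 b8 96
	fmt.Println(hexOfBuffer("世")) // e4 b8 96
}
```

The same three 8-bit bytes appear in both containers; neither one stores "a character" as a single element.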

PS - Rob Pike wrote the first implementation of UTF-8 along with the encoding's inventor, Ken Thompson ;)

Kyle Lemons

Nov 15, 2011, 5:43:47 PM11/15/11
to philip, golang-nuts
> 1. In the go spec it says the size of a byte is 8 bits, at the bottom
> of the spec.
> http://golang.org/doc/go_spec.html
>
> type                                 size in bytes
> byte, uint8, int8                     1

A byte has been defined as 8 bits for quite a while now.

> 2. People in this thread said to use a byte buffer when doing string
> appending (in response to my complaint about potential garbage
> collection problems in a loop).

Rightly.  But nowhere did they say that each byte in a []byte or bytes.Buffer was a code point.
 
> 3. UTF-8 string chars can be between 2 to 4 bytes in length.

Most characters are actually 1 8-bit byte, because (cleverly) the UTF-8 encoding of ASCII is, well, ASCII.  Unicode code points are encoded as 1-4 8-bit bytes in UTF-8.  Unlike UTF-16 and others, UTF-8 is designed specifically to work with individual bytes, and often you don't even need to care that the bytes you're working with are UTF-8 encoded.  And for those times where you do, there are packages in the stdlib waiting for you to use them.  For the sake of future readers, be careful when making claims about things like this without double-checking what you've written; especially for someone coming from a Windows programming background, the notion that characters might be two bytes could be very detrimental to their Go understanding.
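
The 1-to-4-byte range is easy to check with utf8.RuneLen from the standard library. The sketch below prints the encoded size of one sample character from each width class; the helper name encodedSize and the sample characters are chosen for this example only.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedSize returns how many bytes UTF-8 uses for the code point r.
func encodedSize(r rune) int {
	return utf8.RuneLen(r)
}

func main() {
	// Ranging over a string yields code points, not bytes.
	for _, r := range "aé世😀" {
		fmt.Printf("%c U+%04X %d\n", r, r, encodedSize(r))
	}
	// The sizes printed are 1, 2, 3 and 4.
}
```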
~K

Kyle Lemons

Nov 15, 2011, 5:46:33 PM11/15/11
to philip, golang-nuts
> the notion that characters might be two bytes could be very detrimental to their Go understanding.

might *always be two bytes; clearly, it might be two bytes, or one, or three, or four. 