Slices and inconsistency


chandr...@gmail.com

Jun 25, 2020, 8:58:18 PM
to golang-nuts
Hi, I am trying to learn Go (I have been working with C++ for a while). I see what looks like inconsistent behavior between slices and append:


package main

import "fmt"

func main() {
	// append example within capacity
	m := []int{1, 2, 3}
	a := m[0:2]       // len 2, cap 3
	b := append(a, 4) // fits within cap(a), so it writes into m[2]
	a[0] = -1

	fmt.Printf("%v, %d, %d\n", m, len(m), cap(m))
	fmt.Printf("%v, %d, %d\n", a, len(a), cap(a))
	fmt.Printf("%v, %d, %d\n", b, len(b), cap(b))

	// append example with more than capacity
	m1 := []int{1, 2, 3}
	a1 := m1[0:2]          // len 2, cap 3
	b1 := append(a1, 4, 5) // exceeds cap(a1), so a new array is allocated
	a1[0] = -1

	fmt.Printf("%v, %d, %d\n", m1, len(m1), cap(m1))
	fmt.Printf("%v, %d, %d\n", a1, len(a1), cap(a1))
	fmt.Printf("%v, %d, %d\n", b1, len(b1), cap(b1))
}

output is
--------
[-1 2 4], 3, 3
[-1 2], 2, 3
[-1 2 4], 3, 3

[-1 2 3], 3, 3
[-1 2], 2, 3
[1 2 4 5], 4, 6


Essentially, depending on the existing capacity, assignment through one slice affects other slices. This stems from the underlying pointer arithmetic and seems inconsistent. It looks like the programmer needs to know the history of the capacity before understanding the ramifications of slice assignments.

Excuse me if this is a basic question; I thought I would ask.

Regards
Ck


Ian Lance Taylor

Jun 25, 2020, 9:03:16 PM
to chandr...@gmail.com, golang-nuts
Please see https://blog.golang.org/slices.

Ian

David Riley

Jun 26, 2020, 9:23:08 AM
to chandr...@gmail.com, golang-nuts
On Jun 25, 2020, at 8:49 PM, chandr...@gmail.com wrote:
>
> Essentially, depending on the existing capacity, assignment through one slice affects other slices. This stems from the underlying pointer arithmetic and seems inconsistent. It looks like the programmer needs to know the history of the capacity before understanding the ramifications of slice assignments.

You are correct, the programmer needs to read the manual. Slices are "windows" into their underlying array, so assigning data into one slice tied to a given array changes the data seen through other slices of that array as well. The append() function merely extends the slice by n and assigns into those elements, returning the new slice header (the old one can remain unchanged, but they will both be slices of the same array).
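
As a rough illustration of the "window" idea, here is a minimal sketch. The helper names sharedWrite and appendWithinCap are made up for this example and are not from the blog post or the spec; the second helper only covers the case where the appended element fits within the existing capacity.

```go
package main

import "fmt"

// sharedWrite (hypothetical helper) writes through a sub-slice and returns
// both the original slice and the sub-slice so the effect can be inspected.
func sharedWrite() (base, view []int) {
	base = []int{1, 2, 3, 4}
	view = base[1:3] // view is a window onto base[1] and base[2]
	view[0] = 20     // the same memory as base[1]
	return base, view
}

// appendWithinCap (hypothetical helper) appends to a sub-slice that still
// has spare capacity, so append reuses the existing backing array.
func appendWithinCap() (base, view, grown []int) {
	base = []int{1, 2, 3, 4}
	view = base[0:2]         // len 2, cap 4
	grown = append(view, 30) // fits in cap(view): overwrites base[2]
	return base, view, grown
}

func main() {
	base, view := sharedWrite()
	fmt.Println(base, view) // [1 20 3 4] [20 3]

	base, view, grown := appendWithinCap()
	fmt.Println(base, view, grown) // [1 2 30 4] [1 2] [1 2 30]
}
```

Note that view keeps its old length after the append; only the returned header (grown) sees the third element, even though all three slices share one array.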

Ian pointed to the canonical documentation on this: https://blog.golang.org/slices

Additionally, the language spec has this: https://golang.org/ref/spec#Appending_and_copying_slices

I do agree that it would be nicer if this were flagged a little more loudly, since the semantics of the append() builtin can lead the early developer to believe they are making a copy of the slice in question, but they aren't.


- Dave


Robert Engels

Jun 26, 2020, 9:28:37 AM
to David Riley, chandr...@gmail.com, golang-nuts
That is not quite true. Once it is extended beyond the current capacity you will have a new backing array and the slices will diverge.
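
A small sketch of that divergence, reusing the values from the original post. The helper sharesStart is a name invented for this example; it simply compares the address of the first element of each slice.

```go
package main

import "fmt"

// sharesStart (hypothetical helper) reports whether two non-empty slices
// begin at the same element of the same backing array.
func sharesStart(a, b []int) bool {
	return len(a) > 0 && len(b) > 0 && &a[0] == &b[0]
}

func main() {
	m := []int{1, 2, 3}
	a := m[0:2] // len 2, cap 3

	within := append(a, 4)    // needs len 3 <= cap 3: reuses m's array
	beyond := append(a, 4, 5) // needs len 4 > cap 3: a new array is allocated

	fmt.Println(sharesStart(a, within)) // true
	fmt.Println(sharesStart(a, beyond)) // false

	a[0] = -1
	fmt.Println(within[0], beyond[0]) // -1 1
}
```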


David Riley

Jun 26, 2020, 11:53:11 AM
to Robert Engels, chandr...@gmail.com, golang-nuts
> On Jun 26, 2020, at 9:28 AM, Robert Engels <ren...@ix.netcom.com> wrote:
>
>> On Jun 26, 2020, at 8:23 AM, David Riley <frave...@gmail.com> wrote:
>>
>> You are correct, the programmer needs to read the manual. Slices are "windows" into their underlying array, so assigning data into one slice tied to a given array changes the data seen through other slices of that array as well. The append() function merely extends the slice by n and assigns into those elements, returning the new slice header (the old one can remain unchanged, but they will both be slices of the same array).
>>
>> Ian pointed to the canonical documentation on this: https://blog.golang.org/slices
>>
>> Additionally, the language spec has this: https://golang.org/ref/spec#Appending_and_copying_slices
>>
>> I do agree that it would be nicer if this were flagged a little more loudly, since the semantics of the append() builtin can lead the early developer to believe they are making a copy of the slice in question, but they aren't.
>>
> That is not quite true. Once it is extended beyond the current capacity you will have a new backing array and the slices will diverge.

Agreed, but the problem here is that unless you are checking the capacity beforehand, you cannot count on that behavior. The general point is that if you want a modified copy of an array/slice, you need to be explicit about it or you will run into unpredictable problems that are unpleasant to debug. That is not immediately apparent from the superficial semantics of the append() builtin unless you understand slices at a deeper level.
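
Two ways of being explicit, as a minimal sketch. The helper names cloneInts and appendCopy are invented for this example. The first uses make plus copy; the second uses a full slice expression to cap the capacity so that append has no spare room to write into.

```go
package main

import "fmt"

// cloneInts (hypothetical helper) returns an independent copy of s
// with its own backing array.
func cloneInts(s []int) []int {
	c := make([]int, len(s))
	copy(c, s)
	return c
}

// appendCopy (hypothetical helper) returns s plus vals without writing into
// s's backing array. The full slice expression s[:len(s):len(s)] limits the
// capacity to the length, so any real append has to allocate.
// If vals is empty, nothing is appended and the result still aliases s.
func appendCopy(s []int, vals ...int) []int {
	return append(s[:len(s):len(s)], vals...)
}

func main() {
	m := []int{1, 2, 3}
	a := m[0:2]

	b := appendCopy(a, 4) // m[2] is left alone
	c := cloneInts(a)
	a[0] = -1

	fmt.Println(m, a, b, c) // [-1 2 3] [-1 2] [1 2 4] [1 2]
}
```

Either way the caller no longer needs to know the capacity history of a to predict what a later write will touch.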

As in all things, the proper fix is to RTFM, but I think TFM could afford to be a *tiny* bit more explicit about this because it's surprisingly easy to miss.


- Dave



howar...@gmail.com

Jun 26, 2020, 1:28:10 PM
to golang-nuts
"If the capacity of s is not large enough to fit the additional values, append allocates a new, sufficiently large underlying array that fits both the existing slice elements and the additional values. Otherwise, append re-uses the underlying array."

That seems pretty explicit to me. That's from the spec; the other link above, the blog, is even more explicit, showing the implementation outright before introducing the built-in.

The tour also says it: https://tour.golang.org/moretypes/15
"If the backing array of s is too small to fit all the given values a bigger array will be allocated. The returned slice will point to the newly allocated array."

It also links to this documentation: https://blog.golang.org/slices-intro
"The append function appends the elements x to the end of the slice s, and grows the slice if a greater capacity is needed." This is the only one that seems less than explicit about the semantics, and only in the sense that it does not make explicit that 'grows the slice' means 'returns a copy of the slice with greater capacity.'


The latter link then links further to this documentation: https://golang.org/doc/effective_go.html#slices
and just like the blog, it starts by introducing the implementation, where it is clear that only one path includes a copy function, and it explicitly says "If the data exceeds the capacity, the slice is reallocated." A bit later where it covers append directly, it says "The result needs to be returned because, as with our hand-written Append, the underlying array may change."
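
For readers who have not seen those documents, the idea can be sketched roughly as follows. This is a simplified stand-in written for this thread, not the code from Effective Go or the blog, and the growth factor is an arbitrary assumption; the real built-in uses its own growth policy.

```go
package main

import "fmt"

// Append is a simplified, hand-written stand-in for the built-in append,
// restricted to []int. Doubling the needed size is an arbitrary choice.
func Append(slice []int, data ...int) []int {
	n := len(slice) + len(data)
	if n > cap(slice) {
		// Not enough room: allocate a larger array and copy the old elements.
		// This is the only path on which the result stops sharing storage.
		grown := make([]int, len(slice), 2*n)
		copy(grown, slice)
		slice = grown
	}
	// There is room now: extend the slice header over the array and fill in.
	l := len(slice)
	slice = slice[:n]
	copy(slice[l:], data)
	return slice
}

func main() {
	m := []int{1, 2, 3}
	a := m[0:2]
	b := Append(a, 4)    // fits: writes into m[2]
	c := Append(a, 4, 5) // does not fit: c gets its own array
	a[0] = -1
	fmt.Println(m, b, c) // [-1 2 4] [-1 2 4] [1 2 4 5]
}
```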

Four bits of official documentation, all of which mention the 'gotcha' more or less directly. In most of those cases, the discussion of append directly follows the discussion of 'copy', which should make it clear how to get copies of slices.

I think maybe where it gets missed is with people learning the language by example: copying code from StackOverflow, reading code on GitHub, etc. Go is surprisingly easy to pick up by example like that, but it does mean those corner cases can bite. The actual documentation, however, is pretty clear and up-front about the gotchas.

Robert Engels

Jun 26, 2020, 1:42:15 PM
to howar...@gmail.com, golang-nuts
I don’t think “how it works” is the confusion, more “how to use it properly.”

My opinion is that if RTFM is required more than once for a core concept there may be a design problem. It clearly bites a lot of people. Slices are a higher-level struct, as the underlying array is the same as Java’s, but Java doesn’t suffer these knowledge gaps. I’m guessing that is because slices are higher level but not high enough. Done for performance while forsaking some safety and clarity.


David Riley

Jun 26, 2020, 1:50:38 PM
to Robert Engels, howar...@gmail.com, golang-nuts
On Jun 26, 2020, at 1:41 PM, Robert Engels <ren...@ix.netcom.com> wrote:
>
> I don’t think “how it works” is the confusion, more “how to use it properly.”
>
> My opinion is that if RTFM is required more than once for a core concept there may be a design problem. It clearly bites a lot of people. Slices are a higher-level struct, as the underlying array is the same as Java’s, but Java doesn’t suffer these knowledge gaps. I’m guessing that is because slices are higher level but not high enough. Done for performance while forsaking some safety and clarity.

Frankly, I think the problem is with append(). I understand why it's structured the way it is, and yes, I agree the documentation is fairly clear about how it works *if you read it*, but we need to be honest with ourselves about how people read the documentation (or don't). The problem with append() is that it returns a value, and if you only learn by example or quickly skim through the docs (as Howard pointed out), it's not going to be immediately apparent that the result is not necessarily a new slice or array.

I don't say this because I think append() should be modified to be an in-place operator (that would be impractical and would break a lot of things that rely on current behavior). But it shouldn't be mysterious to us that people coming from any language with a variable-length vector think of it incorrectly, because the vast majority of those languages treat an append operation either as in-place or as a copy, and we've kind of split the difference.

I wish I had a more constructive answer to this, because I guess you can't make other people's tutorials call this out, and using the return value as a different slice is a valid thing to do even if it is inadvisable without doing more elaborate checking on the capacity, etc., so it's kind of hard to put in a linter.


- Dave

David Riley

Jun 26, 2020, 1:53:06 PM
to Robert Engels, howar...@gmail.com, golang-nuts
On Jun 26, 2020, at 1:41 PM, Robert Engels <ren...@ix.netcom.com> wrote:
>
> My opinion is that if RTFM is required more than once for a core concept there may be a design problem. It clearly bites a lot of people. Slices are a higher-level struct, as the underlying array is the same as Java’s, but Java doesn’t suffer these knowledge gaps. I’m guessing that is because slices are higher level but not high enough. Done for performance while forsaking some safety and clarity.

Also, to this specific point: this exact approach, as with much of Go, embodies the Bell Labs approach to design (for better or for worse, and with good reason). Sometimes we have to live with the artifacts of evolution.

https://www.jwz.org/doc/worse-is-better.html


- Dave


Tyler Compton

Jun 26, 2020, 2:53:02 PM
to David Riley, Robert Engels, howar...@gmail.com, golang-nuts
On Fri, Jun 26, 2020 at 10:52 AM David Riley <frave...@gmail.com> wrote:
> Also, to this specific point: this exact approach, as with much of Go, embodies the Bell Labs approach to design (for better or for worse, and with good reason). Sometimes we have to live with the artifacts of evolution.

One interesting counterexample here is the GC and scheduler, which take on a huge amount of complexity in the implementation to create a dead-simple interface. It seems like Go is willing to take a worse-is-better approach when the amount of interface complexity is relatively small.

Ian Lance Taylor

Jun 26, 2020, 2:59:55 PM
to Tyler Compton, David Riley, Robert Engels, Howard C. Shaw III, golang-nuts
I honestly don't think that append is an example of worse-is-better.
I think it's an example of preferring to pass and return values rather
than pointers. The language could easily have made append take a
pointer to a slice as its first argument, which would eliminate a
certain class of bugs. But it would also mean that you couldn't
create a slice by simply appending to nil, as in s :=
append([]int(nil), 1, 2, 3). You would instead have to declare the
variable first.
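
To make the comparison concrete, here is a sketch of what a pointer-taking variant might look like. The function appendTo is purely hypothetical and is not part of Go; it is implemented here on top of the real built-in only to show the difference at the call site.

```go
package main

import "fmt"

// appendTo is a hypothetical pointer-taking variant of append. It is not
// part of Go. It updates the caller's slice header in place.
func appendTo(p *[]int, vals ...int) {
	*p = append(*p, vals...)
}

func main() {
	// Value-returning built-in: create and fill a slice in one expression.
	s := append([]int(nil), 1, 2, 3)

	// Pointer-taking variant: the variable has to be declared first.
	var u []int
	appendTo(&u, 1, 2, 3)

	fmt.Println(s, u) // [1 2 3] [1 2 3]
}
```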

While append does confuse people from time to time, I think it's
clearly documented, as pointed out upthread. It's not really all that
hard to understand. And once you understand it, it's easy to use.

I don't think any particular approach is obviously better here, so I
don't think this is an example of worse-is-better.

Ian

David Riley

Jun 26, 2020, 7:48:38 PM
to Tyler Compton, Robert Engels, howar...@gmail.com, golang-nuts
Agreed, and this is a good point! Similar for even early Unix and the memory allocator. Some things you have to make a simple interface for or no one will use them.


- Dave

David Riley

Jun 26, 2020, 8:03:34 PM
to Ian Lance Taylor, Tyler Compton, Robert Engels, Howard C. Shaw III, golang-nuts
On Jun 26, 2020, at 2:59 PM, Ian Lance Taylor <ia...@golang.org> wrote:
>
> I honestly don't think that append is an example of worse-is-better.
> I think it's an example of preferring to pass and return values rather
> than pointers. The language could easily have made append take a
> pointer to a slice as its first argument, which would eliminate a
> certain class of bugs. But it would also mean that you couldn't
> create a slice by simply appending to nil, as in s :=
> append([]int(nil), 1, 2, 3). You would instead have to declare the
> variable first.

I should clarify that I don't consider it a dig; I think it's a practical solution to the problem even if it has its warts, because it is both performant and flexible. Other languages choose either in-place operations (e.g. Java) or copies (e.g. Erlang), but you can do a lot more with what's offered in Go. It just takes some getting used to.

> While append does confuse people from time to time, I think it's
> clearly documented, as pointed out upthread. It's not really all that
> hard to understand. And once you understand it, it's easy to use.

The "once you understand it" part is really the issue here, because it's not immediately obvious if you're not reading the manual.

> I don't think any particular approach is obviously better here, so I
> don't think this is an example of worse-is-better.

The canonical example in Worse Is Better is Unix's solution to the problem where you get an interrupt in the middle of a syscall; Unix just said "hey, this failed, do it again" instead of going to great lengths to try to resume the operation, which was the pragmatic choice even if it led to a somewhat quirky interface.

"Worse Is Better" is often characterized more by quirkiness than by outright badness, and Richard Gabriel (LISP purist that he was) was pretty upfront about being cheekily hyperbolic about the properties that Unix and C had that made them perhaps occasionally inelegant but ultimately more effective and survivable. I think a lot of aspects of Go embody that particular spirit, again for good reason considering its progenitors.


- Dave
