Reasoning behind behavior of range, when index is maintained


Axel Wagner

Jul 25, 2017, 6:30:28 PM
to golang-nuts
Hey,

someone shared [this question](https://www.reddit.com/r/golang/comments/6paqc0/bug_that_caught_me_with_range/) on reddit. I must say that I'm surprised by the behavior myself. I would have expected
for i = range v
to be semantically equivalent to
for i = 0; i < len(v); i++
and don't really understand the reasoning behind choosing different semantics. Note that the difference only exists if i is declared outside of the loop; that is, this is solely about the value of i after the loop has exited.
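
For readers following along, here is a minimal sketch of the difference being asked about. The function names afterRange and afterClassic are made up for illustration and do not come from the Reddit post.

```go
package main

import "fmt"

// afterRange returns the value of i once a range loop over v has finished.
// i is declared outside the loop so that it stays observable afterwards.
func afterRange(v []int) int {
	var i int
	for i = range v {
	}
	return i
}

// afterClassic returns the value of i once the three-clause loop has finished.
func afterClassic(v []int) int {
	var i int
	for i = 0; i < len(v); i++ {
	}
	return i
}

func main() {
	v := []int{10, 20, 30}
	fmt.Println(afterRange(v))   // 2, i.e. len(v)-1
	fmt.Println(afterClassic(v)) // 3, i.e. len(v)
}
```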

I'd greatly appreciate some explanation :)

Chris Manghane

Jul 25, 2017, 6:53:00 PM
to Axel Wagner, golang-nuts
This is mentioned directly in the language specification under For statements with range clause:

 For each entry it assigns iteration values to corresponding iteration variables if present and then executes the block.
and 
 For an array, pointer to array, or slice value a, the index iteration values are produced in increasing order, starting at element index 0. If at most one iteration variable is present, the range loop produces iteration values from 0 up to len(a)-1 and does not index into the array or slice itself. For a nil slice, the number of iterations is 0.

This seems logically correct to me, as well. For an array or slice of len(array) = N, the for range statement generates N iteration values, starting from 0. The Nth iteration value would have to be N-1. The difference in semantics is because the post statement in a for loop must be executed after the body is executed. A typical for-loop assigns the N+1st iteration value to the iteration variable, but that is a user's choice.
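
One way to picture the quoted spec wording ("assigns iteration values ... and then executes the block") is a lowering with a hidden counter. The sketch below is only a conceptual model under that assumption, not the compiler's actual output; rangeLowered, hidden and visit are hypothetical names.

```go
package main

import "fmt"

// rangeLowered mimics `for i = range v { visit(i) }` with a hidden counter.
// Conceptual sketch only: the real compiler output may look different.
func rangeLowered(v []int, visit func(int)) int {
	var i int
	for hidden := 0; hidden < len(v); hidden++ {
		i = hidden // assign the iteration value, then execute the block
		visit(i)
	}
	// hidden reached len(v), but i was last assigned len(v)-1.
	return i
}

func main() {
	last := rangeLowered([]int{10, 20, 30}, func(i int) {
		fmt.Println("visiting", i)
	})
	fmt.Println("i after loop:", last) // i after loop: 2
}
```

In this model the extra increment happens on the hidden counter, which is why the user-visible variable never sees the value len(v).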

Hopefully that is more clear,
Chris

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Axel Wagner

Jul 25, 2017, 7:11:57 PM
to Chris Manghane, golang-nuts
On Wed, Jul 26, 2017 at 12:52 AM, Chris Manghane <cm...@google.com> wrote:
> Hopefully that is more clear,

Not really, sorry :) You basically reiterated the status quo, but hardly explained why those choices were actually made.

> For an array or slice of len(array) = N, the for range statement generates N iteration values, starting from 0. The Nth iteration value would have to be N-1.

That would also be the case if the loop were equivalent to the obvious for-loop with regard to indices.

> The difference in semantics is because the post statement in a for loop must be executed after the body is executed.

Well, yeah, the question is why this wasn't also done for range. Or rather, why the indices assigned by range weren't made equivalent to those of the most obvious for-loop.
Especially since, if you use :=, the generated code actually *does* seem to include the extra increment (bringing the index to len(v), even if unobservably), so pure efficiency does not seem to be the reason.

The reason for the decision might very well be "someone just decided it that way and now that's the way it is". That'd be fine. I'd just be interested to know if there was a more deliberate reasoning behind this.

Chris Manghane

Jul 25, 2017, 7:37:46 PM
to Axel Wagner, golang-nuts
Hmm, well I can't give better reasoning off the top of my head, but I still have to wonder why you expect the behavior of those constructs to be the same, particularly. Imagine, you would like to capture both the iteration variable and the value itself, for example:

var i, v = 0, ""
for i, v = range someSlice {}
println(v)

What is the value of `v`? I expect that the last elements of the iteration of someSlice, namely len(someSlice) - 1 and someSlice[len(someSlice) - 1], would be stored in `i` and `v`, respectively. Alternatively, I might implement this as (or lower it to):

var i, v = 0, ""
for i = 0; i <= len(someSlice); i++ {
    v = someSlice[i]
}

With the alternative implementation, it seems like either the value in `v` would be completely invalid (or cause an out-of-bounds error) or there would be a mismatch between the value that `i` and `v` stop at after iteration. I'm not sure if that's official reasoning, but the semantics of range statements are different in many other situations as well so it seems consistent to me.
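
To make the consistency argument concrete, the following sketch checks that after a two-variable range loop the index and the value still refer to the same element. The helper name lastPair is an assumption made for illustration, and a non-empty slice is assumed.

```go
package main

import "fmt"

// lastPair ranges over s with variables declared outside the loop and
// returns whatever they hold once the loop has finished.
func lastPair(s []string) (int, string) {
	var i, v = 0, ""
	for i, v = range s {
	}
	return i, v
}

func main() {
	s := []string{"a", "b", "c"}
	i, v := lastPair(s)
	// i and v describe the same (last) element, so v == s[i] holds.
	fmt.Println(i, v, v == s[i]) // 2 c true
}
```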

Steven Hartland

Jul 25, 2017, 7:37:48 PM
to golan...@googlegroups.com
It's likely this is because the idx in a range is just that: it's the index of the element being processed and not a loop iteration variable.

As range never processes an element past the end of the slice, there's no way for it to get set to N, hence the difference in semantics from a normal for i := 0; i < N; i++ {} loop which relies on i being set to one past the end to break the loop.

Given this, it's expected that i in a range would only ever get to N-1.

    Regards
    Steve

Axel Wagner

Jul 26, 2017, 2:48:21 AM
to Chris Manghane, golang-nuts
This is actually a really good point, thanks :)

Christoph Berger

Jul 26, 2017, 11:44:46 AM
to golang-nuts
Hi Axel,

An attempt to explain this by looking at the C-style loop only:

The classic C-style for loop

for i:=0; i<len(v); i++ {...}

is equivalent to

for i := 0; i < len(v); {
    // do something with i
    i++ // This is always the very last statement in the loop body
}

The loop body runs from 0 to len(v)-1 only, because the last increment of i to len(v) stops the loop, and no further iteration occurs. The code in the loop body never sees i being set to len(v). 

And that's the same behavior as with the range operator. 

The code in the Reddit post takes advantage of the fact that the last increment of the C-style loop can be observed outside the loop, for detecting if the loop stopped early. This is a neat side effect that is not possible with the range operator.
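
The early-exit detection described here might look roughly like the sketch below. It is an assumed reconstruction of the pattern, not the Reddit poster's actual code; indexClassic and indexRange are hypothetical names.

```go
package main

import "fmt"

// indexClassic returns the index of target in v, or len(v) if it is absent.
// The final i++ is what makes the "not found" case distinguishable.
func indexClassic(v []int, target int) int {
	var i int
	for i = 0; i < len(v); i++ {
		if v[i] == target {
			break
		}
	}
	return i
}

// indexRange tries the same with range. Its result is ambiguous:
// len(v)-1 can mean "found at the last index" or "not found at all".
func indexRange(v []int, target int) int {
	var i int
	for i = range v {
		if v[i] == target {
			break
		}
	}
	return i
}

func main() {
	v := []int{1, 2, 3}
	fmt.Println(indexClassic(v, 3), indexClassic(v, 9)) // 2 3
	fmt.Println(indexRange(v, 3), indexRange(v, 9))     // 2 2
}
```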

Konstantin Khomoutov

Jul 27, 2017, 3:20:16 AM
to golang-nuts
On Wed, Jul 26, 2017 at 08:44:46AM -0700, Christoph Berger wrote:

> The code in the Reddit post takes advantage of the fact that the last
> increment of the C-style loop can be observed outside the loop, for
> detecting if the loop stopped early. This is a neat side effect that is not
> possible with the range operator.

I would point out that both Axel and you are off a tiny bit from what
actually happens ;-)

In a for loop which uses a short variable declaration, that variable's
scope is confined to the for *statement* itself, and is also visible in
the loop's body because its scope is defined to be nested in that of the
loop statement. This means in a loop like

for i := 0; i < len(s); i++ {
}

the variable "i" is not accessible after the closing brace.

The actual "problem" stated in that Reddit post is different: it uses a
variable defined outside the "for" loop:

var i int
for i = 0; i < len(v); i++ {
}

As you can see, the loop merely uses that variable; it existed before
the loop and continued to live on after it finished executing.

To recap what others have already written, since the for loop's post
statement is defined to be executed after each execution of the body
(unless it was exited by means of executing `break` or `return`), that

i++

statement gets executed, the condition evaluates to false, and the loop
exits -- with the variable "i" having the value equal to len(v).

One could do

var x int

for i := 0; i < len(v); i, x = i+1, x*2 {
}

and get an even more interesting effect on the variable "x" after the loop
finishes executing ;-)
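
A runnable variant of that idea is sketched below. One assumption differs from the snippet above: x starts at 1 rather than at its zero value, because 0*2 would stay 0 and hide the effect. The function name doubled is made up for illustration.

```go
package main

import "fmt"

// doubled runs a loop whose post statement updates both i and x,
// and returns x after the loop has finished.
func doubled(v []int) int {
	x := 1 // assumption: start at 1, since 0*2 would remain 0
	for i := 0; i < len(v); i, x = i+1, x*2 {
	}
	return x
}

func main() {
	// The post statement runs once per completed iteration: 1 -> 2 -> 4 -> 8.
	fmt.Println(doubled([]int{7, 8, 9})) // 8
}
```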

Christoph Berger

Jul 27, 2017, 8:21:34 AM
to Konstantin Khomoutov, golang-nuts
That’s actually what I meant to indicate in the last paragraph (emphasis added by me):

> The code in the Reddit post takes advantage of the fact that the last increment of the C-style loop can be observed outside the loop,

But thanks for providing a clarification. I see now it has not been clear to everyone.


tom....@centralway.com

Aug 2, 2017, 4:31:25 AM
to golang-nuts, kos...@bswap.ru
A side effect of this approach is that the index after the range loop will be zero if the slice contains zero or one elements.

This means that code using the index after the range will need to re-test whether the slice was empty to avoid a potential panic.
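
A small sketch of that ambiguity, using a hypothetical helper named indexAfterRange:

```go
package main

import "fmt"

// indexAfterRange returns the value of the index variable after ranging over v.
func indexAfterRange(v []string) int {
	var i int
	for i = range v {
	}
	return i
}

func main() {
	fmt.Println(indexAfterRange(nil))                // 0: the loop body never ran
	fmt.Println(indexAfterRange([]string{"a"}))      // 0: index of the only element
	fmt.Println(indexAfterRange([]string{"a", "b"})) // 1
}
```

Since an empty slice and a one-element slice both yield 0, indexing v[i] afterwards is only safe once len(v) > 0 has been checked.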


Christoph Berger

Aug 2, 2017, 4:59:12 AM
to tom....@centralway.com, golang-nuts, kos...@bswap.ru
Good point. The same is true for the C-style for loop BTW. https://play.golang.org/p/58jwveiywB

Using an explicit boolean flag for signaling success avoids this trap.
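
As a rough illustration of the flag approach (the function find and its result names are assumptions, not code from the linked playground):

```go
package main

import "fmt"

// find reports the index of target in v and whether it was found at all.
// The boolean carries the success signal, so the index value is never
// overloaded to mean "not found".
func find(v []int, target int) (index int, found bool) {
	for i := range v {
		if v[i] == target {
			index, found = i, true
			break
		}
	}
	return index, found
}

func main() {
	fmt.Println(find([]int{1, 2, 3}, 3)) // 2 true
	fmt.Println(find([]int{1, 2, 3}, 9)) // 0 false
	fmt.Println(find(nil, 9))            // 0 false
}
```

With the flag, the empty slice, the one-element slice and the miss case all behave the same for range and for the three-clause loop.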