range over a []byte(string)


Brad Fitzpatrick

Apr 28, 2011, 3:06:24 AM
to golang-dev
It seems that with 6g,

   for i, b := range []byte(str) {
     ...
   }

... allocates a new []byte and copies.

Nothing in the language requires that, does it?  'b' for instance is just a local and not aliased to the byte being iterated over, so the compiler could just treat that clause as a non-UTF8 aware byte iteration, without any copies?

(was curious and doing some experiments after noticing the adler32 bounds checks....)
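
For reference, a minimal sketch of the two loop forms being compared. The function names (sumRangeConverted, sumIndexed) are made up for illustration and are not from the thread. Both visit the same bytes; the only difference is whether a []byte conversion appears in the range clause. Whether that conversion actually allocates depends on the compiler and its version, so this shows the semantics only, not the performance.

```go
package main

import "fmt"

// sumRangeConverted walks the bytes of s through a []byte conversion.
// This is the form from the message above; a compiler that does not
// optimize it will allocate a new slice and copy the string into it.
func sumRangeConverted(s string) int {
	total := 0
	for _, b := range []byte(s) {
		total += int(b)
	}
	return total
}

// sumIndexed reads the same bytes by indexing the string directly,
// so no conversion is involved at all.
func sumIndexed(s string) int {
	total := 0
	for i := 0; i < len(s); i++ {
		total += int(s[i])
	}
	return total
}

func main() {
	s := "héllo" // 'é' occupies two bytes in UTF-8, so len(s) is 6
	fmt.Println(len(s), sumRangeConverted(s) == sumIndexed(s))
}
```

Note that a plain `for i, r := range str` is not equivalent: it decodes UTF-8 and yields runes, which is why the conversion is written in the first place.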

Rob 'Commander' Pike

Apr 28, 2011, 3:16:01 AM
to Brad Fitzpatrick, golang-dev

it depends on the body of the loop

   for i, b := range []byte(str) {
     b[i]++
   }

(you could construct a more complex and useful example) would overwrite the string, but strings are constant.

so with proper analysis, the compiler could avoid the allocation, but not in general.

-rob

Brad Fitzpatrick

Apr 28, 2011, 3:18:45 AM
to Rob 'Commander' Pike, golang-dev
isn't b just a single byte here, not a []byte?

Rob 'Commander' Pike

Apr 28, 2011, 3:24:48 AM
to Brad Fitzpatrick, golang-dev

you're right. the (pseudo-)copy is unreachable so the allocation is unnecessary, i think. but i've been wrong before.

-rob
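
A small sketch of the point settled above: the range value variable is a per-iteration copy of the element, so writing to it cannot reach the underlying storage. The helper name bumpLocals is hypothetical. A real []byte is used here rather than a converted string so that the absence of aliasing is actually observable.

```go
package main

import "fmt"

// bumpLocals increments the loop variable for every element of buf and
// collects the incremented values. buf itself is never written to,
// because b is a copy of each element, not a reference to it.
func bumpLocals(buf []byte) []byte {
	bumped := make([]byte, 0, len(buf))
	for _, b := range buf {
		b++ // modifies only the local copy
		bumped = append(bumped, b)
	}
	return bumped
}

func main() {
	buf := []byte("abc")
	bumped := bumpLocals(buf)
	fmt.Printf("%s %s\n", buf, bumped) // prints "abc bcd"
}
```

Since the loop body can only see copies, nothing in it can modify the temporary slice produced by []byte(str), which is what makes skipping the allocation plausible in this case.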

Russ Cox

Apr 28, 2011, 7:50:17 AM
to Brad Fitzpatrick, golang-dev
On Thu, Apr 28, 2011 at 03:06, Brad Fitzpatrick <brad...@golang.org> wrote:
> It seems that with 6g,
>    for i, b := range []byte(str) {
>      ...
>    }
> ... allocates a new []byte and copies.
> Nothing in the language requires that, does it?  'b' for instance is just a
> local and not aliased to the byte being iterated over, so the compiler could
> just treat that clause as a non-UTF8 aware byte iteration, without any
> copies?

That's a possible optimization but not one that is done.
It may be worth doing, but it's important to make sure
it happens in both compilers. Otherwise you get very
different performance characteristics in the two worlds.

This happened once before, when I made

*(*T)(unsafe.Pointer(&x))

recognized by gc and not trigger an allocation (it is how
math.Float32bits etc get implemented). The math tests
that used it ran gccgo-compiled binaries out of memory
before Ian added the same logic to gccgo. I'm a little wary
of alloc-removing optimizations after that, but I think they
are in the end unavoidable.

Russ
