Why does this struct escape to the heap?


Paul D

Sep 4, 2018, 5:46:09 PM
to golang-nuts
I'm trying to reduce allocations (and improve performance) in some Go code. There's a recurring pattern in the code where a struct is passed to a function, and the function passes one of the struct's methods to strings.IndexFunc. For some reason, this causes the entire struct to escape to the heap. If I wrap the method call in an anonymous function, the struct does not escape and the benchmarks run about 30% faster.

Here is a minimal example. In the actual code, the struct has more fields/methods and the function in question actually does something. But this sample code illustrates the problem. Why does the opts argument escape to the heap in index1 but not in the functionally equivalent index2? And is there a robust way to ensure that it stays on the stack?


type options struct {
    zero rune
}

func (opts *options) isDigit(r rune) bool {
    r -= opts.zero
    return r >= 0 && r <= 9
}

// opts escapes to heap
func index1(s string, opts options) int {
    return strings.IndexFunc(s, opts.isDigit)
}

// opts does not escape to heap
func index2(s string, opts options) int {
    return strings.IndexFunc(s, func(r rune) bool {
        return opts.isDigit(r)
    })
}


FYI I'm running Go 1.10.3 on Linux. Thanks...


silviu...@gmail.com

Sep 5, 2018, 12:05:39 AM
to golang-nuts
Hi Paul,


Basically, in your index1, the opts.isDigit passed to IndexFunc is syntactic sugar for (&opts).isDigit, so the compiler needs to move opts to the heap.

You can either 
a) define an alternate isDigitVal method, e.g. func (opts options) isDigitMethodByVal(r rune) bool
or 
b) you can just pass the reference to an original *options from the get go:

func index1Pointer(s string, optsPointer *options) int {
    return strings.IndexFunc(s, optsPointer.isDigit)
}

Not sure which one will be faster - you'll need to benchmark. Probably both will be about the same, since that single *options allocation tends to zero.
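For reference, here is a minimal sketch of both options side by side. The names isDigitByVal and indexByVal are made up for illustration and are not part of the original code. The sketch makes no claim about what escapes; building it with -gcflags="-m" on your own Go version is the way to find out.

```go
package main

import (
	"fmt"
	"strings"
)

type options struct {
	zero rune
}

// isDigit is the pointer-receiver method from the original post.
func (opts *options) isDigit(r rune) bool {
	r -= opts.zero
	return r >= 0 && r <= 9
}

// isDigitByVal is a hypothetical value-receiver twin of isDigit (option a).
func (opts options) isDigitByVal(r rune) bool {
	r -= opts.zero
	return r >= 0 && r <= 9
}

// indexByVal (hypothetical name) passes the value-receiver method value,
// which binds a copy of opts rather than its address.
func indexByVal(s string, opts options) int {
	return strings.IndexFunc(s, opts.isDigitByVal)
}

// index1Pointer is option b: the caller already holds a *options.
func index1Pointer(s string, optsPointer *options) int {
	return strings.IndexFunc(s, optsPointer.isDigit)
}

func main() {
	opts := options{zero: '0'}
	// Both calls find the digit '7' at byte index 2.
	fmt.Println(indexByVal("ab7", opts), index1Pointer("ab7", &opts)) // prints: 2 2
}
```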

For your index2 function question, I think it's because Go maintains the variables from a parent function that are referred to in an enclosed function (closure) for as long as both are alive. 
I'm guessing that opts.isDigit(r) is alive and kicking till it returns. 

cheers,
silviu

roger peppe

Sep 5, 2018, 4:59:07 AM
to Paul D, golang-nuts
I think the escape analysis is at fault here. The two index functions should
have the same characteristics. Both opts.isDigit and the explicit closure
capture the pointer to opts, and I don't see why the compiler shouldn't
be able to detect that the former doesn't escape just as it does the latter.

Please raise an issue at golang.org/issue.
This seems like
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

paul...@gmail.com

Sep 5, 2018, 8:03:12 AM
to golang-nuts
Hi Silviu,

Thanks for your reply. I'm not sure about the points you raise though.

> Basically, in your index1, the opts.isDigit passed to IndexFunc is syntactic sugar for (&opts).isDigit -> then the compiler needs to move opts on the heap.

Why? Taking a reference to a function argument shouldn't automatically move it to the heap. As long as the function doesn't store or return the reference, it doesn't exist beyond the scope of that function. Escape analysis should be able to figure that out and leave the value on the stack. My real code passes &opts to several functions without issue. The only ones that cause a problem are the IndexFunc calls.

> For your index2 function question, I think it's because Go maintains the variables from a parent function that are referred in an enclosed function (closure) for as long as both are alive.
> I'm guessing that opts.IsDigit(r) is alive and kicking till it returns.

Sorry, I'm not sure what you mean by this. Could you expand on it?

Thanks,
Paul

paul...@gmail.com

Sep 5, 2018, 8:15:09 AM
to golang-nuts

FYI, here is the escape analysis output.


 $ go build -gcflags="-l -m=2"

./escape.go:20:10: <S> capturing by ref: opts (addr=true assign=false width=4)
./escape.go:9:38: (*options).isDigit opts does not escape
./escape.go:15:34: opts escapes to heap
./escape.go:15:34:      from opts.isDigit (call part) at ./escape.go:15:34
./escape.go:14:37: moved to heap: opts
./escape.go:14:37: index1 s does not escape
./escape.go:15:34: index1 opts.isDigit does not escape
./escape.go:18:37: index2 s does not escape
./escape.go:19:30: index2 func literal does not escape
./escape.go:20:14: index2.func1 opts does not escape


Line 15 is the call to IndexFunc from the function index1. I think "call part" is supposed to be the reason that opts escapes to the heap but I couldn't find any information about what that means.
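The -m output can be cross-checked at run time by counting allocations. The following is a rough sketch using testing.AllocsPerRun from the standard library. The measure helper and the sink variable are invented for this example. No expected numbers are given because they depend on the Go version and on whether inlining is enabled (the output above was produced with -l, which disables it).

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

type options struct {
	zero rune
}

func (opts *options) isDigit(r rune) bool {
	r -= opts.zero
	return r >= 0 && r <= 9
}

// index1 passes the method value directly.
func index1(s string, opts options) int {
	return strings.IndexFunc(s, opts.isDigit)
}

// index2 wraps the method call in a function literal.
func index2(s string, opts options) int {
	return strings.IndexFunc(s, func(r rune) bool {
		return opts.isDigit(r)
	})
}

// sink keeps the results live so the calls are not optimized away.
var sink int

// measure (hypothetical helper) returns the average number of heap
// allocations per call for index1 and index2.
func measure() (a1, a2 float64) {
	opts := options{zero: '0'}
	a1 = testing.AllocsPerRun(1000, func() { sink = index1("abc123", opts) })
	a2 = testing.AllocsPerRun(1000, func() { sink = index2("abc123", opts) })
	return a1, a2
}

func main() {
	a1, a2 := measure()
	fmt.Printf("index1: %.0f allocs/op, index2: %.0f allocs/op\n", a1, a2)
}
```

Running the program once normally and once with go run -gcflags=-l shows whether inlining changes the picture on a given toolchain.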

Silviu Capota Mera

Sep 5, 2018, 9:19:52 AM
to paul...@gmail.com, golang-nuts
Hi Paul,

Perhaps my answer was a bit simplistic, and I admit that I answered without looking too much into it. 

> Basically, in your index1, the opts.isDigit passed to IndexFunc is syntactic sugar for (&opts).isDigit -> then the compiler needs to move opts on the heap.

"Why? Taking a reference to a function argument shouldn't automatically move it to the heap. As long as the function doesn't store or return the reference, it doesn't exist beyond the scope of that function. Escape analysis should be able to figure that out and leave the value on the stack. My real code passes &opts to several functions without issue. The only ones that cause a problem are the IndexFunc calls."

I agree with you that the escape analysis ought to be able to detect that scenario. My initial reasoning was that the compiler could be taking the safest approach for the following scenario: 
- the (&opts).isDigit function is passed to IndexFunc (or any other function). 
- that IndexFunc function may or may not diverge into concurrent execution (using go ... ) 
- while those detached execution flows are still running, the opts variable runs the risk of being popped out of its stack existence when index1 returns, so (&opts).isDigit would not be addressable from the other execution threads.

That was my reasoning at the time I replied. In the meantime, out of curiosity, I copied IndexFunc and its private sister function, indexFunc, out of the strings package and added a convoluted goroutine inside the private indexFunc that takes the function as a parameter and sleeps for a number of seconds, while the main execution flow returns. From what I'm seeing, with that change the opts in index2 also (correctly) escapes to the heap.

This leaves me with agreeing with you and Roger that there's room for improvement. 

> For your index2 function question, I think it's because Go maintains the variables from a parent function that are referred in an enclosed function (closure) for as long as both are alive.
> I'm guessing that opts.IsDigit(r) is alive and kicking till it returns.

"Sorry, I'm not sure what you mean by this. Could you expand on it?"

What I meant is that the closure, that wrapper func you created inside index2 is aware of the surrounding scope, and stateful. It will have access to &opts for as long as it's alive. 


cheers
silviu




paul...@gmail.com

Sep 5, 2018, 1:06:24 PM
to golang-nuts
I wonder if this is to do with method values. According to the spec, when you declare a method value like x.M:

The expression x is evaluated and saved during the evaluation of the method value; the saved copy is then used as the receiver in any calls, which may be executed later.

So using the method value opts.isDigit in index1 does in fact result in &opts being copied. Maybe this causes opts to escape to the heap (although I don't know why the copy would need to live beyond the scope of index1). This would also explain why opts does not escape in index2 where opts.isDigit() is just a normal method call.

I tested this theory with two new functions (neither of which call IndexFunc -- that doesn't seem to be part of the problem). One function calls the isDigit method directly and the other uses a method value. They're functionally equivalent but opts only escapes in the second function.


// isDigit called directly: opts does not escape to heap
func isDigit1(r rune, opts options) bool {
    return opts.isDigit(r)
}

// isDigit called via method value: opts escapes to heap
func isDigit2(r rune, opts options) bool {
    f := opts.isDigit
    return f(r)
}


Does anyone have any insight/views on a) whether this is really what's happening and b) whether this is the desired behaviour? I don't see why using method values in this way should cause a heap allocation but perhaps there's a reason for it.
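The spec wording quoted above can be seen in a small program. This is only a sketch: isDigitByVal is a hypothetical value-receiver variant that does not appear in the original code, and demo is an illustrative helper. It shows what is saved when a method value is evaluated, not what the escape analysis decides.

```go
package main

import "fmt"

type options struct {
	zero rune
}

// Pointer receiver: the method value opts.isDigit saves &opts.
func (opts *options) isDigit(r rune) bool {
	r -= opts.zero
	return r >= 0 && r <= 9
}

// Value receiver (hypothetical variant): the method value saves a copy of opts.
func (opts options) isDigitByVal(r rune) bool {
	r -= opts.zero
	return r >= 0 && r <= 9
}

// demo binds both method values, then mutates opts before calling them.
func demo() (viaPointer, viaCopy bool) {
	opts := options{zero: '0'}
	byPtr := opts.isDigit      // shorthand for (&opts).isDigit
	byVal := opts.isDigitByVal // receiver copied here, with zero == '0'
	opts.zero = 'a'            // visible through byPtr only
	return byPtr('5'), byVal('5')
}

func main() {
	viaPointer, viaCopy := demo()
	fmt.Println(viaPointer, viaCopy) // prints: false true
}
```

The pointer-receiver method value observes the later write to opts.zero, which confirms that what it holds is the address of opts rather than a copy of the struct.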

Tristan Colgate

Sep 6, 2018, 11:33:17 AM
to paul...@gmail.com, golang-nuts
I think this has to do with the pointer receiver vs. the pass by value:

func noEscape(r rune, opts *options) bool {
    f := opts.isDigit
    return f(r)
}

opts here does not escape, but in:

func escapes(r rune, opts options) bool {
    f := opts.isDigit
    return f(r)
}

opts is copied, so it is the copy of opts that the compiler believes escapes. Perhaps this is because opts could be used by a defer (there is none though, the compiler could/should notice that).

In the following, opts2 even escapes and gets heap allocated.

func escapes(r rune, opts *options) bool {
    var res bool
    {
        opts2 := *opts
        f := opts2.isDigit
        res = f(r)
    }
    return res
}

Did you open an issue? I'm curious if there is a reason the escape analysis can't pick this up.



paul...@gmail.com

Sep 6, 2018, 12:26:31 PM
to golang-nuts
Using a pointer receiver (as in your noEscape example) just pushes the problem up the stack. When you try to call it, e.g.


func parent() bool {
    var opts options
    return noEscape('0', &opts)
}


you find that &opts escapes to the heap in the parent function instead.

I haven't opened an issue yet (I was hoping to get confirmation that it was a bug first) but will do so today unless someone posts a definitive answer here.

Thanks...

Tristan Colgate

Sep 6, 2018, 12:39:41 PM
to paul...@gmail.com, golang-nuts

paul...@gmail.com

Sep 6, 2018, 1:03:56 PM
to golang-nuts

Yes, I came across that post when looking for info on method values and allocations. I'm sure that's the root of the problem. But I don't understand why the compiler can't figure out that no heap allocation is needed. If t doesn't escape when calling t.M(), why can't the compiler work out that t.M shouldn't cause t to escape either?

Tristan Colgate

Sep 6, 2018, 2:58:19 PM
to paul...@gmail.com, golang-nuts
In hopes of giving the compiler the best possible chance, I've tried:

type options struct {
}

func (opts *options) isDigit(r rune) bool {
        r -= '0'
        return r >= 0 && r <= 9
}

func escapes(r rune, opts options) bool {
        return (*options).isDigit(&opts, r)
}

opts still escapes. I've tested every version of Go since 1.7, and it always has.
I suspect this is known and correct behaviour, but I've no idea why.