I've got the following message from gc:
"closure needs too many variables; runtime will reject it"
As I understand it, the nested function in my code uses too many
variables from the enclosing context.
I found that this compiler message is generated in
$GOROOT/src/cmd/gc/closure.c.
The walkclosure function contains the following code fragment:
if(narg*widthptr > 100)
	yyerror("closure needs too many variables; runtime will reject it");
Assuming that widthptr on an AMD64 CPU is 8 bytes, we get 100/8 = 12
(rounded down) variables that can be saved in a closure.
Can anyone explain why this restriction is needed?
Is it possible to increase the number of variables?
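The arithmetic above can be checked with a minimal Go sketch. It assumes only the quoted condition narg*widthptr > 100; the names closureArgLimit and maxClosureVars are hypothetical and do not appear in closure.c.

```go
package main

import "fmt"

// closureArgLimit mirrors the constant 100 in the quoted check.
// The name is made up for this illustration.
const closureArgLimit = 100

// maxClosureVars returns the largest narg for which
// narg*widthptr > closureArgLimit is still false.
func maxClosureVars(widthptr int) int {
	return closureArgLimit / widthptr
}

func main() {
	fmt.Println(maxClosureVars(8)) // 8-byte pointers
	fmt.Println(maxClosureVars(4)) // 4-byte pointers
}
```

Under the same check, a target with 4-byte pointers would allow 25 variables.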
That's correct.
> As I came to know, this compiler message is generated in
> $GOROOT/src/cmd/gc/closure.c.
> There's the following code fragment in walkclosure function:
>
> if(narg*widthptr > 100)
> yyerror("closure needs too many variables; runtime will reject it");
>
> Assuming that widthptr on AMD64 cpu is 8 bytes, we have 100/8 = 12
> possible variables, saved in a closure.
>
> Who may explain to me, why this restriction is needed?
> Is it possible to increase the number of variables?
The restriction is there because the generated code for a
closure does not include stack overflow checks, so it can
only assume 128 bytes of local stack frame, and each
closed variable turns into a word on the stack during the
closure trampoline.
The fix is probably not to generate the stack overflow check
(it varies from system to system and would be fragile to
put a copy of that logic in a second place) but to pass in
a single pointer to the closed variables and make access
to such a variable a double-indirect instead of the current
single-indirect.
Russ
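The difference between the two calling schemes can be imitated in ordinary Go. This is only a sketch of one reading of the description above, not the code gc generates: sumDirect, sumViaRecord and closedVars are hypothetical names, and the record is assumed to hold the addresses of the closed variables.

```go
package main

import "fmt"

// Current scheme, as described above: every closed variable arrives as
// its own pointer-sized word, so each use is a single indirection.
func sumDirect(a, b, c *int) int {
	return *a + *b + *c
}

// closedVars is a hypothetical record holding the addresses of the
// closed variables.
type closedVars struct {
	a, b, c *int
}

// Proposed scheme: only one pointer is passed, however many variables
// are closed over; each use goes through the record and then through
// the variable's address (double indirection).
func sumViaRecord(cv *closedVars) int {
	return *cv.a + *cv.b + *cv.c
}

func main() {
	x, y, z := 1, 2, 3
	fmt.Println(sumDirect(&x, &y, &z))
	fmt.Println(sumViaRecord(&closedVars{a: &x, b: &y, c: &z}))
}
```

With the second form the number of words passed to the function no longer grows with the number of closed variables, which is what would make the size check unnecessary.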
I had supposed that every closure simply holds a pointer to the frame of
the enclosing function, and that all frames are allocated on the heap, so
that when a function exits, its frame is still accessible.
As I see now, when a closure is instantiated, pointers to the needed
variables of the enclosing function are copied into the closure.
This solution seems very strange to me. Many languages support nested
functions; Pascal, for example. As far as I know, nested functions in
Pascal are implemented the way I supposed. Of course, a kind of
'double-indirect' would have to be used in Go too.
A side effect, though, may be that large frames of exited functions
stay on the heap while only some of their variables are referenced by
closures. But I think that is not a real problem if the implementation
of closures is carefully described in the language specification.
Sergei.
Pascal did not allow closures to refer to outer variables
after the function that declared them returned. Go does,
so any closed-over variables are heap allocated, not part
of the stack frame. The various arguments are the
addresses of the closed-over variables.
Russ
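A small Go example of the behaviour described above, where a closure keeps using a variable after the function that declared it has returned. The name newCounter is made up for this illustration.

```go
package main

import "fmt"

// newCounter returns a closure that keeps reading and writing count
// after newCounter itself has returned, so count cannot live in
// newCounter's stack frame.
func newCounter() func() int {
	count := 0
	return func() int {
		count++
		return count
	}
}

func main() {
	next := newCounter()
	fmt.Println(next(), next(), next()) // prints "1 2 3"
}
```

Each call to newCounter gets its own count, so two counters do not interfere with each other.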
NOTE: I put 'double-indirect' and 'single-indirect' in quotes because
a 'double-indirect' is really a 'single-indirect' with a constant
offset added to the pointer.
I think that in this case all variables of a function that are accessible
in nested functions must be allocated on the heap in one 'chunk'. Every
nested function must then have a pointer to this 'chunk' (as well as
a pointer to the 'chunk' of the function in which the enclosing function
is nested, etc.)
P.S. As I see it, every function accesses through pointers those of its
own variables that are used in functions nested in it. Am I right? If I
am right, the next question arises: is it the right thing?
> I think, in this case all variables of a function, that are accessible
> in nested functions, must be allocated in heap in one 'chunk'. So,
> every nested function must have a pointer to this 'chunk' (as well as
> a pointer to the 'chunk' of function, in which the enclosing function
> is nested, etc.)
It can be done that way. The way gccgo works is that each nested
function builds a heap allocated struct (a composite literal) which
holds the addresses of all the variables that it references in
enclosing functions. That gives a single pointer which is then passed
to the nested function.
> P.S. As I see, every function accesses its own variables, that are
> used in functions, nested in it, by pointers. Am I right?
Yes.
> If I am
> right, the next question arises: is it the right thing?
It is a right thing. Other approaches are possible. This one
minimizes the amount of heap allocated storage while maintaining
flexibility for multiple nested functions.
Ian
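The gccgo scheme described above can be imitated by hand in Go. This is an illustrative sketch, not gccgo's actual output: the types incVars and readVars and the functions inc, read and outer are hypothetical. Each nested function gets its own record of addresses, and each closed variable is a separate heap allocation.

```go
package main

import "fmt"

// incVars and readVars stand in for the per-function records of
// addresses. Both refer to the same counter n.
type incVars struct {
	n *int
}

type readVars struct {
	n     *int
	label *string
}

// inc and read play the role of two nested functions; each receives a
// single pointer to its own record.
func inc(v *incVars) { *v.n++ }

func read(v *readVars) string { return fmt.Sprintf("%s=%d", *v.label, *v.n) }

// outer plays the role of the enclosing function. Every closed
// variable is allocated on its own, so it stays alive only as long as
// some record still points to it.
func outer() (*incVars, *readVars) {
	n := new(int)
	label := new(string)
	*label = "n"
	return &incVars{n: n}, &readVars{n: n, label: label}
}

func main() {
	i, r := outer()
	inc(i)
	inc(i)
	fmt.Println(read(r)) // prints "n=2"
}
```

If the record held by read is dropped, label can be collected while n stays reachable through inc's record. With the single 'chunk' layout, the whole chunk would stay on the heap instead.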