Function with huge stack frame


Clifford Wolf

Feb 13, 2015, 7:24:49 AM
to emscripte...@googlegroups.com
Hello,

I'm getting a "RangeError: Maximum call stack size exceeded" exception in my recursive algorithm. The problem is not deep recursion: this happens at 11 recursion levels of my function plus another 10-20 levels of function calls above that from the main program. The problem is that my function has a huge stack frame, and I don't know why it has it. I am using the following debug code at the top of my function to monitor what is happening:

EM_ASM_({
var i = 0;
function stackExplorer() { i++; stackExplorer(); }
try { stackExplorer(); } catch (e) { console.log("--> revursion level: " + $0 + ", free stack: " + i); }
}, recursion_counter);

The output I get when running in node.js is this:

--> revursion level: 8, free stack: 5876
--> revursion level: 9, free stack: 4112
--> revursion level: 10, free stack: 2347
--> revursion level: 11, free stack: 583

/home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:84
      throw ex;
            ^
RangeError: Maximum call stack size exceeded

So the stack frame of my function is about 1765 times larger than the stack frame of the stackExplorer() function. In Firefox I have the same problem (the stack frame is even slightly larger there). The stack trace printed by Firefox (see end of this mail) also shows that there is not a large stack of intermediate function calls using up the space on the call stack. It is really just my function calling itself (the two intermediate functions dynCall_* and invoke_* are added by emscripten).

I am willing to rewrite my code to reduce the size of the stack frame. But I don't understand how the call stack size is affected by my C++ code. (I assume that local objects of my C/C++ functions end up on the javascript heap.) Can I somehow profile where that large stack frame size comes from?

Here is the code of my AstNode::simplify() function:

(Beware: This function is not pretty, and it is over 2000 lines long. One day I have to refactor it but it would be nice to understand this issue before I do that.)

jfyi: The branch https://github.com/cliffordwolf/yosys/tree/emcc-debug can be used as a test case. It already contains a Makefile.conf with the correct build settings and an alternative main() that will trigger the error. With "emcc" in your path (i.e. after ". emsdk_set_env.sh") one only needs to run "make" to produce "yosys.js", and "node yosys.js" produces the above error.

Thanks in advance for your help and insight and of course many thanks for making emcc in the first place!

regards,
 - clifford


--- Output incl. stack trace from firefox ---

"--> revursion level: 4, free stack: 17300" yosys.js line 301 > eval:1
"--> revursion level: 5, free stack: 14260" yosys.js line 301 > eval:1
"--> revursion level: 6, free stack: 11220" yosys.js line 301 > eval:1
"--> revursion level: 7, free stack: 8178" yosys.js line 301 > eval:1
"--> revursion level: 8, free stack: 5138" yosys.js line 301 > eval:1
"--> revursion level: 9, free stack: 2096" yosys.js line 301 > eval:1
"exception thrown: InternalError: too much recursion,__ZN5Yosys3AST7AstNode8simplifyEbbbiibb@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:220682:2
dynCall_iiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1225101:12
invoke_iiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7980:12
__ZN5Yosys3AST7AstNode8simplifyEbbbiibb@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:223678:20
dynCall_iiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1225101:12
invoke_iiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7980:12
__ZN5Yosys3AST7AstNode8simplifyEbbbiibb@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:223678:20
dynCall_iiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1225101:12
invoke_iiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7980:12
__ZN5Yosys3AST7AstNode8simplifyEbbbiibb@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:224281:19
dynCall_iiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1225101:12
invoke_iiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7980:12
__ZN5Yosys3AST7AstNode8simplifyEbbbiibb@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:223678:20
dynCall_iiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1225101:12
invoke_iiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7980:12
__ZN5Yosys3AST7AstNode8simplifyEbbbiibb@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:223555:17
dynCall_iiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1225101:12
invoke_iiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7980:12
__ZN5Yosys3AST7AstNode8simplifyEbbbiibb@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:223678:20
dynCall_iiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1225101:12
invoke_iiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7980:12
__ZN5Yosys3AST7AstNode8simplifyEbbbiibb@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:223294:15
dynCall_iiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1225101:12
invoke_iiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7980:12
__ZN5Yosys3AST7AstNode8simplifyEbbbiibb@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:223158:19
__ZN5Yosys3AST7AstNode8simplifyEbbbiibb@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:221655:12
__ZN5YosysL14process_moduleEPNS_3AST7AstNodeEb@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:219163:13
dynCall_iii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1225045:12
invoke_iii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7908:12
__ZN5Yosys3AST7processEPNS_5RTLIL6DesignEPNS0_7AstNodeEbbbbbbbbbbbb@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:216459:16
dynCall_viiiiiiiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1225010:5
invoke_viiiiiiiiiiiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7863:5
__ZN5Yosys15VerilogFrontend7executeERPNSt3__113basic_istreamIcNS1_11char_traitsIcEEEENS1_12basic_stringIcS4_NS1_9allocatorIcEEEENS1_6vectorISB_NS9_ISB_EEEEPNS_5RTLIL6DesignE@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:325020:15
dynCall_viiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1224919:5
invoke_viiiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7746:5
__ZN5Yosys8Frontend13frontend_callEPNS_5RTLIL6DesignEPNSt3__113basic_istreamIcNS4_11char_traitsIcEEEENS4_12basic_stringIcS7_NS4_9allocatorIcEEEENS4_6vectorISD_NSB_ISD_EEEE@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:15220:6
dynCall_viiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1225115:5
invoke_viiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7998:5
__ZN5Yosys8Frontend13frontend_callEPNS_5RTLIL6DesignEPNSt3__113basic_istreamIcNS4_11char_traitsIcEEEENS4_12basic_stringIcS7_NS4_9allocatorIcEEEESD_@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:14961:6
dynCall_viiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1225115:5
invoke_viiii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7998:5
__ZN12_GLOBAL__N_111TechmapPass7executeENSt3__16vectorINS1_12basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEENS6_IS8_EEEEPN5Yosys5RTLIL6DesignE@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:929036:16
dynCall_viii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1225087:5
invoke_viii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7962:5
__ZN5Yosys4Pass4callEPNS_5RTLIL6DesignENSt3__16vectorINS4_12basic_stringIcNS4_11char_traitsIcEENS4_9allocatorIcEEEENS9_ISB_EEEE@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:13058:2
dynCall_vii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1224933:5
invoke_vii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7764:5
__ZN5Yosys4Pass4callEPNS_5RTLIL6DesignENSt3__112basic_stringIcNS4_11char_traitsIcEENS4_9allocatorIcEEEE@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:12645:8
dynCall_vii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1224933:5
invoke_vii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7764:5
__ZN12_GLOBAL__N_112TestCellPass7executeENSt3__16vectorINS1_12basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEENS6_IS8_EEEEPN5Yosys5RTLIL6DesignE@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1050408:86
dynCall_viii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1225087:5
invoke_viii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7962:5
__ZN5Yosys4Pass4callEPNS_5RTLIL6DesignENSt3__16vectorINS4_12basic_stringIcNS4_11char_traitsIcEENS4_9allocatorIcEEEENS9_ISB_EEEE@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:13058:2
dynCall_vii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1224933:5
invoke_vii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7764:5
__ZN5Yosys4Pass4callEPNS_5RTLIL6DesignENSt3__112basic_stringIcNS4_11char_traitsIcEENS4_9allocatorIcEEEE@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:12880:6
dynCall_vii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1224933:5
invoke_vii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7764:5
__ZN5Yosys8run_passENSt3__112basic_stringIcNS0_11char_traitsIcEENS0_9allocatorIcEEEEPNS_5RTLIL6DesignE@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:125815:2
dynCall_vii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1224933:5
invoke_vii@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:7764:5
__Z5main_iPPc@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:9438:11
_main@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:10514:3
asm._main@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1228824:8
callMain@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1231384:15
doRun@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1231442:42
run/<@file:///home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:1231453:7
" yosys.html:1245

Jukka Jylänki

Feb 13, 2015, 12:21:48 PM
to emscripte...@googlegroups.com
The numbers for the available stack size that you print out look very odd to me. At recursion level 4, it is stating that it has 17300 bytes(?) of stack space left. Assuming that you have not changed the default Emscripten stack size during compilation, which is 5MB (see here: https://github.com/kripken/emscripten/blob/master/src/settings.js#L54 ), then having only 17300 bytes of that 5MB left sounds like the huge majority of the stack is already gone. The decrease of the stack by ~3KB per level of recursion looks very normal for a large function like that.

Therefore I suspect that the stack is swallowed by something already earlier than the recursive function. Try backing up the callstack and execution and print out the stack sizes to try to find a location where you still have ~5MB'ish of that stack space left available, and see where the majority of that is killed.

Most often that occurs if you have a function with a large local array, something like:

void foo()
{
   int localTempArray[1024*1024]; // Consumes 4MB of stack space.
   // ...
}

Another way to consume the stack space can occur if one uses the JS-facing Runtime.stackAlloc function to allocate a very large stack size for JS<->C interop. However if you are not doing custom JS interop or calling that function manually, then I don't think that is the case here.

--
You received this message because you are subscribed to the Google Groups "emscripten-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to emscripten-disc...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Clifford Wolf

Feb 13, 2015, 1:34:52 PM
to emscripte...@googlegroups.com
Thanks for your reply,


On Friday, February 13, 2015 at 6:21:48 PM UTC+1, jj wrote:
The numbers for the available stack size that you print out look very odd to me. At recursion level 4, it is stating that it has 17300 bytes(?) of stack space left.

Not bytes. The number is how many recursion levels of a trivial JavaScript function fit before "Maximum call stack size exceeded" is thrown. I posted the code; here it is again:

EM_ASM_({
var i = 0;
function stackExplorer() { i++; stackExplorer(); }
try { stackExplorer(); } catch (e) { console.log("--> recursion level: " + $0 + ", free stack: " + i); }
}, recursion_counter);

Each recursion level of stackExplorer() increases the variable i until the call stack is exceeded and a "Maximum call stack size exceeded" exception is thrown. This exception is caught and "i" is printed. I don't know of any method to measure the available or used size on the stack in bytes. But I can do it this way.
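The same probe also works outside of EM_ASM_, in plain JavaScript. The following is a minimal sketch, assuming an engine that throws a catchable error when the call stack is exhausted (RangeError in V8, InternalError in Firefox). The name measureFreeStack is made up for illustration and is not part of Emscripten.

```javascript
// Counts how many trivial frames still fit on the JS call stack from the
// point of the call. The result is a relative unit, not a byte count.
function measureFreeStack() {
  let depth = 0;
  function stackExplorer() {
    depth++;
    stackExplorer();
  }
  try {
    stackExplorer();
  } catch (e) {
    // Call stack exhausted: RangeError in V8, InternalError in Firefox.
  }
  return depth;
}

const freeStack = measureFreeStack();
console.log("--> free stack: " + freeStack);
```

The numbers are only comparable within one engine and one run. A JIT may change the frame size of the probe function between calls, so small fluctuations are to be expected.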

I have added another call like that to my main function:

--> main, free stack: 20946
--> recursion level: 1, free stack: 18221
--> recursion level: 2, free stack: 16463
--> recursion level: 3, free stack: 14700
--> recursion level: 3, free stack: 14700
--> recursion level: 2, free stack: 16463
--> recursion level: 3, free stack: 14700
....
--> recursion level: 8, free stack: 5876
--> recursion level: 9, free stack: 4112
--> recursion level: 9, free stack: 4112
--> recursion level: 8, free stack: 5876
--> recursion level: 9, free stack: 4112
--> recursion level: 10, free stack: 2347
--> recursion level: 11, free stack: 583

/home/clifford/Work/handicraft/2014/verilearn/yosys/yosys.js:84
      throw ex;
            ^
RangeError: Maximum call stack size exceeded

So everything in the stack trace I posted above the first call to AstNode::simplify() takes up as much space on the call stack as 2725 recursions of stackExplorer(). Let's call this unit a Stack Explorer Unit (SEU). I count 13 C functions in the JavaScript stack trace I posted, so that's about 209 SEU on average per C function. This is already interestingly high. But AstNode::simplify() costs about 1765 SEU per level, and thus my stack of initially 20946 SEU is exhausted after only a few recursions.
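The arithmetic behind these figures can be re-checked with a few lines of JavaScript. This is only a worked calculation over the readings printed in the node.js output; the variable names are made up for illustration.

```javascript
// Free-stack readings (in SEU) copied from the node.js output.
const freeAtMain = 20946;
const freeAtLevel = { 1: 18221, 2: 16463, 3: 14700, 8: 5876, 9: 4112, 10: 2347, 11: 583 };

// Everything between main() and the first simplify() call.
const setupCost = freeAtMain - freeAtLevel[1];   // 2725 SEU
const avgPerCFunction = setupCost / 13;          // about 209.6 SEU

// Cost of one more simplify() level, from consecutive readings.
const perLevel = [
  freeAtLevel[1] - freeAtLevel[2],    // 1758
  freeAtLevel[2] - freeAtLevel[3],    // 1763
  freeAtLevel[8] - freeAtLevel[9],    // 1764
  freeAtLevel[9] - freeAtLevel[10],   // 1765
  freeAtLevel[10] - freeAtLevel[11],  // 1764
];

// With 583 SEU left at level 11, another level of about 1765 SEU cannot fit.
const fitsNextLevel = freeAtLevel[11] >= Math.max(...perLevel);
console.log(setupCost, avgPerCFunction.toFixed(1), perLevel, fitsNextLevel);
```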
 
Assuming that you have not changed the default Emscripten stack size during compilation, which is 5MB (see here: https://github.com/kripken/emscripten/blob/master/src/settings.js#L54 ),

Changing this parameter has no effect. (I have tried that already and now I have tried again just to be sure.)

In my understanding this is the size of the data stack. But I am running out of space on the call stack of the JavaScript virtual machine. (I would not know how emscripten could have a parameter for that. To the best of my knowledge there is no interface to manipulate the VM call stack size from JavaScript.)

void foo()
{
   int localTempArray[1024*1024]; // Consumes 4MB of stack space.
   // ...
}

But this consumes 4MB on the _data_ stack. For example:

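#include <emscripten.h>

// Recurses without end; each level reserves a large local array on the data stack.
void test(int i, char *p)
{
    // Report how much JavaScript call stack is left at this level.
    EM_ASM_({
        var i = 0;
        function stackExplorer() { i++; stackExplorer(); }
        try { stackExplorer(); } catch (e) { console.log("--> recursion level: " + $0 + ", free stack: " + i); }
    }, i);

    char *stuff[256*1024]; // 256K pointers: 1MB of data stack per level with 4-byte pointers
    stuff[i] = p;
    test(i+1, p);
}

int main(int, char **argv)
{
    test(1, 0);
    return 0;
}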

Compile and run:

$ emcc test.cc
$ node a.out.js 
--> recursion level: 1, free stack: 20940
--> recursion level: 2, free stack: 20937
--> recursion level: 3, free stack: 20933
--> recursion level: 4, free stack: 20930

/home/clifford/Work/handicraft/2014/verilearn/yosys/a.out.js:84
      throw ex;
            ^
abort() at Error
    at jsStackTrace (/home/clifford/Work/handicraft/2014/verilearn/yosys/a.out.js:987:13)
    at stackTrace (/home/clifford/Work/handicraft/2014/verilearn/yosys/a.out.js:1004:22)
    at abort (/home/clifford/Work/handicraft/2014/verilearn/yosys/a.out.js:8711:25)
    at __Z4testiPc [test(int, char*)] (/home/clifford/Work/handicraft/2014/verilearn/yosys/a.out.js:5085:70)
    at __Z4testiPc [test(int, char*)] (/home/clifford/Work/handicraft/2014/verilearn/yosys/a.out.js:5098:2)
    at __Z4testiPc [test(int, char*)] (/home/clifford/Work/handicraft/2014/verilearn/yosys/a.out.js:5098:2)
    at __Z4testiPc [test(int, char*)] (/home/clifford/Work/handicraft/2014/verilearn/yosys/a.out.js:5098:2)
    at __Z4testiPc [test(int, char*)] (/home/clifford/Work/handicraft/2014/verilearn/yosys/a.out.js:5098:2)
    at _main (/home/clifford/Work/handicraft/2014/verilearn/yosys/a.out.js:5110:2)
    at Object.asm._main (/home/clifford/Work/handicraft/2014/verilearn/yosys/a.out.js:8491:19)

Notice how this error differs from the error I posted above?

This program runs out of data stack. But on the call stack it has plenty of space left and only takes 3-4 SEU per recursion. My program runs out of call stack. But not because of the number of recursion levels but because of the large number of SEUs that each recursion level costs.
 
I would like to know how I can reduce the number of SEUs that my function takes on the call stack (I would assume that on the JavaScript side it has to do with the number of local variables or something like that) and how this relates to the C++ code that I feed into emcc, so I know how I have to refactor my function to make it work.

Or, if it turns out to be something like the number of local variables, maybe there is something that emcc can do to automatically encapsulate them and put them on the heap, or something similar, to reduce the SEU of the generated function. That would be awesome.

Thanks for reading all the way to the end. Any help is much appreciated.

regards,
 - clifford

Jukka Jylänki

Feb 13, 2015, 1:49:40 PM
to emscripte...@googlegroups.com
The Emscripten C/asm.js local stack and the JavaScript execution call stack are two separate entities. The Emscripten C/asm.js local stack lives in the Emscripten HEAP and all the local variables in the C code live in that stack. There are several ways to examine the bytes used in this stack:
  - read the STACKTOP JavaScript variable.
  - take an address of a local variable in a function (the C stack grows up in Emscripten)
  - Use the JS Runtime.stackAlloc() function to allocate some bytes on the C stack, and examine where those bytes were received at. This gives the position the C stack is at.

However, as you have now explained, it doesn't look like your code runs out of the Emscripten C stack, but of the JS engine call stack. This stack is completely shielded from manipulation in JS code for security purposes. A hackish way to examine the call stack depth is to count how many lines there are in the string "new Error().stack.toString()", since there's one line per function call in there.
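The line-counting trick can be sketched as follows. This assumes V8-style stack traces, where each frame is a line starting with "at"; other engines format traces differently. Note that V8 records only 10 frames by default, so the sketch temporarily raises Error.stackTraceLimit. The helper names countStackFrames and nested are made up for illustration.

```javascript
// Returns the number of frames in the current stack trace, including this
// function's own frame. Assumes V8-style lines: "    at name (file:line:col)".
function countStackFrames() {
  const previousLimit = Error.stackTraceLimit;
  Error.stackTraceLimit = Infinity; // V8 records only 10 frames by default
  const trace = String(new Error().stack);
  Error.stackTraceLimit = previousLimit;
  return trace.split("\n").filter((line) => /^\s*at /.test(line)).length;
}

// Calls itself `levels` times, then measures the depth at the innermost call.
function nested(levels) {
  return levels === 0 ? countStackFrames() : nested(levels - 1);
}

const framesHere = nested(0);
const framesDeeper = nested(25);
console.log("extra frames: " + (framesDeeper - framesHere));
```

This counts calls, not stack space, so it would not reveal a single function with an unusually large frame.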

The number of JS local variables in a function affects the maximum depth in terms of calls that the JS call stack can be, so if the compiled version of the function uses up a huge number of local variables, the JS call stack limit may be exhausted quickly.
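The effect of locals on the reachable depth can be observed with a small experiment. This is a rough sketch, not a statement about how any particular engine lays out its frames: it generates two recursive functions with new Function, one with 2 and one with 500 locals that stay live across the recursive call, and records how deep each gets before the call stack overflows. All names here are hypothetical.

```javascript
// Builds a recursive function with `localCount` locals that are all used
// after the recursive call, so the engine has to keep them in the frame.
function makeRecursiveFunction(localCount) {
  const names = [];
  for (let i = 0; i < localCount; i++) names.push("v" + i);
  const declarations = names
    .map((name, i) => "var " + name + " = n + " + i + ";")
    .join("\n");
  const sum = names.length > 0 ? " + " + names.join(" + ") : "";
  const body =
    declarations + "\n" +
    "state.depth = n;\n" +
    "return self(n + 1, state, self)" + sum + ";";
  return new Function("n", "state", "self", body);
}

// Recurses until the call stack overflows and returns the deepest level reached.
function maxDepth(localCount) {
  const f = makeRecursiveFunction(localCount);
  const state = { depth: 0 };
  try {
    f(1, state, f);
  } catch (e) {
    // Call stack exhausted.
  }
  return state.depth;
}

const depthFewLocals = maxDepth(2);
const depthManyLocals = maxDepth(500);
console.log("max depth with 2 locals:   " + depthFewLocals);
console.log("max depth with 500 locals: " + depthManyLocals);
```

The absolute numbers depend on the engine, its version, and its stack size setting; only the difference between the two is of interest.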

To limit the number of local variables in a function, you can try the following:
   - use an aggressive optimization level (-O3) to reduce the number of locals,
   - use the linker flag -s AGGRESSIVE_VARIABLE_ELIMINATION=1 to try to reduce the number of locals further (https://github.com/kripken/emscripten/blob/master/src/settings.js#L190),
   - use the function outliner, e.g. -s OUTLINING_LIMIT=5000 (https://github.com/kripken/emscripten/blob/master/src/settings.js#L169)
   - restrict LLVM use of function inlining by a) avoiding setting any --llvm-lto if you happened to be using that, and b) setting e.g. -s INLINING_LIMIT=50 to restrict LLVM inlining (see https://github.com/kripken/emscripten/blob/master/src/settings.js#L169 )
   - manually break up the large function into smaller separate chunks.

If none of those help, there is of course a chance that there's a miscompilation - looking at that function, it seems to be at least special in that it's very large, and that it uses gotos in it, which might be a combination that's quite rare in practice.

Clifford Wolf

Feb 13, 2015, 2:34:23 PM
to emscripte...@googlegroups.com
On Friday, February 13, 2015 at 7:49:40 PM UTC+1, jj wrote:
To limit the number of local variables in a function, you can try the following:
   - use an aggressive optimization level (-O3) to reduce the number of locals,
   - use the linker flag -s AGGRESSIVE_VARIABLE_ELIMINATION=1 to try to reduce the number of locals further (https://github.com/kripken/emscripten/blob/master/src/settings.js#L190),
   - use the function outliner, e.g. -s OUTLINING_LIMIT=5000 (https://github.com/kripken/emscripten/blob/master/src/settings.js#L169)
   - restrict LLVM use of function inlining by a) avoiding setting any --llvm-lto if you happened to be using that, and b) setting e.g. -s INLINING_LIMIT=50 to restrict LLVM inlining (see https://github.com/kripken/emscripten/blob/master/src/settings.js#L169 )
   - manually break up the large function into smaller separate chunks.

All of those together brought it from 1700 SEU down to 1100 SEU.
 
If none of those help, there is of course a chance that there's a miscompilation - looking at that function, it seems to be at least special in that it's very large, and that it uses gotos in it, which might be a combination that's quite rare in practice.

I've now counted them: the created JavaScript function has 7044 local variables.

Maybe the gotos create a control flow graph that makes it particularly hard for emcc to keep track of the local variables? Btw: at least on simple inputs that require fewer than 10 levels of recursion, it looks like the function does the right thing.

I will make some more experiments tomorrow and will get back here when I have new results.


Alon Zakai

Feb 13, 2015, 11:44:36 PM
to emscripte...@googlegroups.com
7044 local variables? wow! :) Is this on an unoptimized build, or -O1? I would be surprised to see that on -O2 or -O3.

- Alon


Clifford Wolf

Feb 14, 2015, 2:07:59 AM
to emscripte...@googlegroups.com
On Saturday, February 14, 2015 at 5:44:36 AM UTC+1, Alon Zakai wrote:
7044 local variables? wow! :) Is this on an unoptimized build, or -O1? I would be surprised to see that on -O2 or -O3.

No. This is with "-O3 -s AGGRESSIVE_VARIABLE_ELIMINATION=1 -s OUTLINING_LIMIT=5000 -s INLINING_LIMIT=50", as jj suggested.

My default is -Os without the other options. With that I get a JavaScript function with 10530 local variables.

Clifford Wolf

Feb 14, 2015, 12:12:39 PM
to emscripte...@googlegroups.com
On Saturday, February 14, 2015 at 8:07:59 AM UTC+1, Clifford Wolf wrote:
My default is -Os without the other options. With that I get a JavaScript function with 10530 local variables.

I have now done a little more analysis over my project. This is not even the function with the most local variables, there are two larger ones. One with 12919 and one with 16594 variables:

<skipping first 4979 results>
  4302 __ZN12_GLOBAL__N_114dump_cell_exprERNSt3__113basic_ostreamIcNS0_11char_traitsIcEEEENS0_12basic_stringIcS3_NS0_9allocatorIcEEEEPN5Yosys5RTLIL4CellE
  4554 __Z24frontend_verilog_yyparsev
  4592 __ZN12_GLOBAL__N_110ExposePass7executeENSt3__16vectorINS1_12basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEENS6_IS8_EEEEPN5Yosys5RTLIL6DesignE
  4700 __ZN12_GLOBAL__N_110BtorDumper9dump_cellEPKN5Yosys5RTLIL4CellE
  5023 __ZN5Yosys3AST7AstNode8genRTLILEib
  5212 __ZN12_GLOBAL__N_113TechmapWorker14techmap_moduleEPN5Yosys5RTLIL6DesignEPNS2_6ModuleES4_RNSt3__13setIPNS2_4CellENS7_4lessISA_EENS7_9allocatorISA_EEEERKNS7_3mapINS2_8IdStringENS8_ISI_NS2_14sort_by_id_strENSD_ISI_EEEENSB_ISI_EENSD_INS7_4pairIKSI_SL_EEEEEEb
  5512 __ZN12_GLOBAL__N_112replace_cellEPN5Yosys5RTLIL4CellERKNS_7rules_tERKNS4_6bram_tERKNS4_7match_tERNS0_7hashlib4dictINSt3__112basic_stringIcNSF_11char_traitsIcEENSF_9allocatorIcEEEEiNSD_8hash_opsISL_EEEEi
 10604 __ZN5Yosys3AST7AstNode8simplifyEbbbiibb
 12919 __ZN5Yosys6SatGen10importCellEPNS_5RTLIL4CellEi
 16594 __ZN12_GLOBAL__N_119replace_const_cellsEPN5Yosys5RTLIL6DesignEPNS1_6ModuleEbbbbb


     0 -   100 vars:  4270 functions
   100 -   200 vars:   365 functions
   200 -   500 vars:   188 functions
   500 -  1000 vars:    76 functions
  1000 -  2000 vars:    51 functions
  2000 -  5000 vars:    33 functions
  5000 - 10000 vars:     3 functions
 10000 - 16594 vars:     3 functions

(The difference in the number of variables for the simplify() function is most likely because I am now using a script to count; before, I was just doing some hacks in my text editor.)
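For readers who want to make a similar count, here is a rough sketch of one way to do it. It is not the script used for the numbers above. It assumes that locals appear in "var a = 0, b = 0;" statements with simple initializers that contain no commas or semicolons, as in typical asm.js output; the function name countLocals and the sample source are made up.

```javascript
// Counts the names declared in `var ...;` statements of a function's source.
// Naive on purpose: it splits on commas, so it only works when initializers
// contain no commas or semicolons.
function countLocals(functionSource) {
  const varStatement = /\bvar\s+([^;]+);/g;
  let count = 0;
  let match;
  while ((match = varStatement.exec(functionSource)) !== null) {
    count += match[1].split(",").length;
  }
  return count;
}

const sample =
  "function _f($a) { $a = $a | 0; var $0 = 0, $1 = 0, sp = 0; " +
  "var $2 = 0.0; return ($0 + $1) | 0; }";
console.log(countLocals(sample));
```

Function parameters are not counted, since they are not declared with var.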

The frontend_verilog_yyparse at 4554 variables is a parser generated with bison. So whatever that looks like should not be too uncommon in larger projects. (But I'm not sure how many of those are ported to JavaScript. ;)

On Saturday, February 14, 2015 at 5:44:36 AM UTC+1, Alon Zakai wrote:
7044 local variables? wow! :) Is this on an unoptimized build, or -O1? I would be surprised to see that on -O2 or -O3.
 
So either I am doing something fundamentally wrong in my project, or functions with thousands of local JavaScript variables are something one just gets on occasion.

Under what circumstances does emcc create additional local variables? I would have assumed that the number of local variables would rather be in the order of the number of registers you'd usually have in a CPU. I'm thankful for any insights or pointers to documentation. (I haven't found any documentation on those implementation details, but maybe I was just looking in the wrong places.)

What the functions in my list above all have in common is that they are rather large, usually built around one large switch statement or a series of if-statements. The source code of the two functions with the most JavaScript vars (importCell and replace_const_cells) can be found here, just in case anyone is curious:


regards,
 - clifford

Alon Zakai

Feb 14, 2015, 4:40:36 PM
to emscripte...@googlegroups.com
Is it possible you are not optimizing the source files, and only passing in the optimization flag during linking? Or vice versa? Either can cause that. See http://kripken.github.io/emscripten-site/docs/compiling/Building-Projects.html#building-projects-with-optimizations

If that's not it, please make a standalone testcase, you might be hitting a bug here (although I am not aware of anything open right now that could cause that).

- Alon


Clifford Wolf

Feb 15, 2015, 4:08:58 AM
to emscripte...@googlegroups.com
On Saturday, February 14, 2015 at 10:40:36 PM UTC+1, Alon Zakai wrote:
Is it possible you are not optimizing the source files, and only passing in the optimization flag during linking? Or vice versa? Either can cause that. See http://kripken.github.io/emscripten-site/docs/compiling/Building-Projects.html#building-projects-with-optimizations

I am so sorry everyone. I only compiled with -Os and did not link with it. I double and triple checked that so many times but still messed it up.
 
Now that I also link with -Os my simplifier function has 180 SEU and works fine.

Thank you so much everyone. I'm sorry that I bothered you with such a trivial problem.

regards,
 - clifford

Chad Austin

Feb 17, 2015, 3:41:20 AM
to emscripte...@googlegroups.com
No worries.  :)  As with any human factors/tooling problem, if a smart person with good intentions encounters a bug or usability pitfall, it's likely dozens or hundreds of other people will too...  Perhaps the compiler could do a better job reducing the number of JavaScript variables even without a -O flag?

--
Chad Austin
Technical Director, IMVU

Clifford Wolf

Feb 17, 2015, 3:48:30 AM
to emscripte...@googlegroups.com
On Tuesday, February 17, 2015 at 9:41:20 AM UTC+1, Chad Austin wrote:
No worries.  :)  As with any human factors/tooling problem, if a smart person with good intentions encounters a bug or usability pitfall, it's likely dozens or hundreds of other people will too...  Perhaps the compiler could do a better job reducing the number of JavaScript variables even without a -O flag?

I don't think that this would be necessary. But a check in the linker that sees if link-time optimizations and compile-time optimization match (and produce a warning if they don't) would have helped a lot. I can imagine that this is a common problem as you usually don't pass options like -O to the linker, so Makefiles have to be changed and things can go wrong. (Also: In many cases there might not be an observable problem with the generated code (other than slow code execution and big output files) and the user might just incorrectly assume that all available optimization was used.)

Bruce Mitchener

Feb 17, 2015, 4:56:35 AM
to emscripte...@googlegroups.com
In this case, the compiler is generating LLVM bitcode files which don't say what optimization flags were used so there isn't something for the linker to compare against. It is also legitimate to use differing optimization flags for compilation and linking. It is also possible (and with Emscripten common) to use different optimization flags for different files in the compilation phase. An example of that is that -Oz is used on the libcxx sources that are built and stored in the cache.

I agree with Chad though that defaults could be better. I have had to spend a lot of time trying to sort out what flags to use...

 - Bruce


Alon Zakai

Feb 17, 2015, 2:11:53 PM
to emscripte...@googlegroups.com
Yeah, hard to warn on the opts directly for those reasons.

But I added a warning on the incoming branch, based on the number of variables, that is shown if full optimizations are not run. Clifford, can you please verify that in your codebase, the warning shows up when you link with -O0 or -O1 and not when you link with -O2 or above?
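The idea behind such a check can be sketched in a few lines. This is purely illustrative and not the code that was added to Emscripten: the threshold, the function name, and the message text are all assumptions, since none of them are stated in this thread.

```javascript
// Hypothetical threshold; the value used by the real warning is not given here.
const MAX_LOCALS_BEFORE_WARNING = 1000;

// Returns warning messages for functions with many locals, but only when
// the link step ran below -O2 (that is, without full optimizations).
function checkLocals(optLevel, localsPerFunction) {
  const warnings = [];
  if (optLevel >= 2) return warnings;
  for (const [name, count] of Object.entries(localsPerFunction)) {
    if (count > MAX_LOCALS_BEFORE_WARNING) {
      warnings.push(name + " has " + count + " locals; consider linking with -O2 or above");
    }
  }
  return warnings;
}

// Counts taken from the list earlier in the thread, plus one made-up small function.
const counts = {
  __ZN5Yosys3AST7AstNode8simplifyEbbbiibb: 10604,
  _small_helper: 12,
};
const warningsAtO0 = checkLocals(0, counts);
const warningsAtO2 = checkLocals(2, counts);
console.log(warningsAtO0, warningsAtO2);
```

Keying the check on the variable count rather than on the flags avoids the problem Bruce describes, since bitcode files do not record which optimization flags were used.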

- Alon
