threads: 'Cannot allocate memory'


Felix

Nov 19, 2015, 9:18:26 AM11/19/15
to torch7
Hi

I am using the torch threads module. From time to time I get the following error when the thread pool is created:


/opt/torch/install/share/lua/5.1/threads/threads.lua:264:

[thread 1 callback] /opt/torch/install/share/lua/5.1/sys/init.lua:38: attempt to index local 'f' (a nil value)

stack traceback:

      /opt/torch/install/share/lua/5.1/sys/init.lua:38: in function 'execute'

      /opt/torch/install/share/lua/5.1/sys/init.lua:71: in function 'uname'

      /opt/torch/install/share/lua/5.1/sys/init.lua:81: in main chunk

      [C]: in function 'require'

      /opt/torch/install/share/lua/5.1/image/init.lua:33: in main chunk

      [C]: in function 'require'

      trainParallel.lua:123: in function <trainParallel.lua:120>


where trainParallel.lua:120 is the following function definition:

function(threadid) -- This function is executed once in each thread when the pool is created.

  require 'torch'

  require 'image'

  require 'cunn'

  print('Starting a new child thread ' .. threadid)

end


Looking at the code in sys/init.lua, where the exception is thrown, it does the following:


...

init.lua:37:        local f = io.popen(cmd)

init.lua:38:        local s = f:read('*all')

...

init.lua:71:        local os = execute('uname -a')
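
If io.popen fails, it returns nil, and line 38 then indexes that nil value, which matches the traceback above. A minimal sketch of a guarded version (plain Lua, not the actual sys code; the function name is illustrative):

```lua
-- Sketch: io.popen forks the process to run cmd. If fork() fails --
-- e.g. with ENOMEM (errno 12) when the kernel refuses to duplicate a
-- very large parent address space -- io.popen returns nil, and calling
-- f:read() on that nil raises exactly the error in the traceback:
-- "attempt to index local 'f' (a nil value)".
local function guarded_execute(cmd)
   local f = io.popen(cmd)
   if not f then
      error("io.popen('" .. cmd .. "') failed -- likely fork()/ENOMEM")
   end
   local s = f:read('*all')
   f:close()
   return s
end

print(guarded_execute('uname -a'))
```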


After the program aborts, running os.execute('uname -a') manually in Torch gives the error: 'Cannot allocate memory 12'


Torch itself does not seem to be out of memory.


Any ideas what might be wrong?


Regards, Felix



Felix

Nov 25, 2015, 5:19:29 PM11/25/15
to torch7
The problem seems related to loading a large amount of training data (about 20GB). Even though there is still quite a lot of free memory, Torch starts to have issues once I allocate that much data.
If I run it with less data, it works without problems.

alban desmaison

Nov 26, 2015, 4:24:37 AM11/26/15
to torch7
You may want to give `threads.sharedserialize` a try (https://github.com/torch/threads/blob/master/test/test-threads-shared.lua#L11).
If you ONLY READ those 20GB, they will be shared between your threads.
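
A minimal sketch of what that looks like, assuming the torch `threads` package is installed (the tensor size, pool size, and variable names are illustrative; requires a Torch runtime):

```lua
local threads = require 'threads'

-- Must be set BEFORE the pool is created: jobs then pass tensors by
-- pointing at the same underlying storage instead of serializing a copy.
threads.serialization('threads.sharedserialize')

local bigData = torch.FloatTensor(1000, 1000)  -- stand-in for the ~20GB set

local pool = threads.Threads(4, function(threadid)
   require 'torch'
end)

for i = 1, 4 do
   pool:addjob(
      function()
         -- bigData is captured as an upvalue; with sharedserialize every
         -- thread sees the same storage, so nothing is copied. This is
         -- safe only if the workers treat the data as read-only.
         return bigData:nElement()
      end,
      function(n) print('thread saw ' .. n .. ' elements') end)
end

pool:synchronize()
pool:terminate()
```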