cuda runtime error (60): peer mapping resources exhausted when requiring 'cunn'/'cutorch'

Qingnan Fan

Jun 22, 2016, 8:57:16 AM

to torch7

Here is the full output.

--------------------------------------------------------

THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-9142/cutorch/lib/THC/THCGeneral.c line=176 error=60 : peer mapping resources exhausted

qlua: cuda runtime error (60) : peer mapping resources exhausted at /tmp/luarocks_cutorch-scm-1-9142/cutorch/lib/THC/THCGeneral.c:176

stack traceback:

[C]: at 0x7f89b4f589c0

[C]: at 0x7f8941b84970

[C]: in function 'require'

/home/qingnan/torch/install/share/lua/5.1/cutorch/init.lua:2: in main chunk

[C]: in function 'require'

/mnt/qn/code/imgSmooth/test2.lua:267: in main chunk

--------------------------------------------------------

Torch installs successfully, but when I require a CUDA-related module such as cunn or cutorch, this error appears.

I'm using two servers. Torch works fine on the one with 8 K80 GPUs but fails on this one, which has 16 K80 GPUs. Does this mean Torch can't handle that many GPUs at the same time? Every time I run Torch, it starts a thread on each GPU.

Could anybody help me with this?

alban desmaison

Jun 22, 2016, 10:32:37 AM

to torch7

You can use NVIDIA's CUDA_VISIBLE_DEVICES environment variable to prevent Torch from initializing on the GPUs that you don't want to use:

CUDA_VISIBLE_DEVICES=0,1,2,3 th -lcutorch
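As a rough sketch of how that prefix form is scoped, the snippet below compares it with `export`. It does not touch any GPU: `sh -c` stands in for a real program such as `th`, and the `report` variable is a hypothetical helper introduced only for this illustration.

```shell
# Stand-in for a GPU program such as `th`: it only reports what it inherits.
report='echo "visible: ${CUDA_VISIBLE_DEVICES:-all}"'

unset CUDA_VISIBLE_DEVICES
sh -c "$report"                                # visible: all
CUDA_VISIBLE_DEVICES=0,1,2,3 sh -c "$report"   # visible: 0,1,2,3
sh -c "$report"                                # visible: all (the prefix applied to one command only)

# Exporting instead applies the restriction to every later command in this shell.
export CUDA_VISIBLE_DEVICES=0,1,2,3
sh -c "$report"                                # visible: 0,1,2,3
```

The prefix form is convenient for one-off runs; the exported form suits a whole session on a shared machine, at the cost of being easy to forget.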

Qingnan Fan

Jun 27, 2016, 11:51:05 AM

to torch7 on behalf of alban desmaison

Great, it works! But why can't the cutorch.setDevice() method restrict computation to a single GPU?

Sorry for the late reply!


alban desmaison

Jun 27, 2016, 12:26:34 PM

to torch7

When you use the CUDA_VISIBLE_DEVICES environment variable, only the devices listed there are usable, so you cannot call setDevice with an index outside that range. You can use cutorch.getDeviceCount() to see how many devices are available.
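To make the "out of range" point concrete, here is a minimal sketch of how the visible devices get renumbered. It assumes CUDA's usual behaviour of renumbering the listed devices in the order given, together with cutorch's 1-based device indices. `visible_map` is a hypothetical helper written for this illustration, not a Torch or CUDA command, and it only parses the string without querying any GPU.

```shell
# Hypothetical helper: show which cutorch index each listed GPU would get.
# The function body runs in a subshell, so the IFS change does not leak out.
visible_map() (
  IFS=,
  i=1
  for id in $1; do
    echo "cutorch device $i -> physical GPU $id"
    i=$((i + 1))
  done
)

visible_map "4,5,6"
# cutorch device 1 -> physical GPU 4
# cutorch device 2 -> physical GPU 5
# cutorch device 3 -> physical GPU 6
```

Under these assumptions, a process started with CUDA_VISIBLE_DEVICES=4,5,6 would see cutorch.getDeviceCount() report 3, and cutorch.setDevice(4) would be out of range even though the machine has a GPU with that ID.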

