torch.setnumthreads(4)
torch.getnumthreads()
4
The neural net code uses OpenMP for parallelization. If it is still using only 1 CPU, then either you are using a module backed by BLAS optimizations (and the BLAS you are linked against is not multi-CPU enabled), or OpenMP's thread count is being overridden, perhaps by an environment variable (OMP_NUM_THREADS), or the nn module you are using has no parallelization code.
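As a quick sanity check (a minimal sketch, assuming a standard shell and the th interpreter), you can rule out the environment-variable case by clearing OMP_NUM_THREADS before launching and confirming the thread count sticks:

$ echo $OMP_NUM_THREADS    # if this prints 1, it is capping OpenMP
$ unset OMP_NUM_THREADS    # remove the override before starting th
$ th
torch.setnumthreads(4)
torch.getnumthreads()
4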
-- The C compiler identification is GNU 4.9.1
-- The CXX compiler identification is GNU 4.9.1
...
-- Check for working CXX compiler: /usr/local/bin/g++-4.9
-- Check for working CXX compiler: /usr/local/bin/g++-4.9 -- works
...
-- Try OpenMP C flag = [-fopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Success
-- Try OpenMP CXX flag = [-fopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Success
-- Found OpenMP: -fopenmp
-- Compiling with OpenMP support
torch.setnumthreads(4)
model = nn.Sequential()
model:add(nn.Linear(200, 10))
model:add(nn.Tanh())
model:add(nn.Linear(10, 2))
model:add(nn.LogSoftMax())
If it's a small model, try using mini-batches so there is enough work to parallelize over. Most nn modules accept dim+1-dimensional inputs (where dim is the expected number of input dimensions) and compute over the whole batch, as in the sketch below.
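For example (a sketch using the model above; the batch size of 128 is an arbitrary choice), nn.Linear(200, 10) accepts either a 1D input of size 200 or a 2D mini-batch of size batchSize x 200:

require 'nn'  -- assumes the Sequential model from the snippet above is defined

-- single example: 1D input of size 200, little work for OpenMP to split
input = torch.randn(200)
out = model:forward(input)        -- 1D output of size 2

-- mini-batch: 2D input of 128 examples, enough work to parallelize over
batch = torch.randn(128, 200)
batchOut = model:forward(batch)   -- 2D output of size 128x2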