The recent CUDA 8.0 release no longer supports the EXCLUSIVE_THREAD compute mode.
The problem is that when I run the Kaldi recipe on my single machine with the GPUs in EXCLUSIVE_PROCESS mode, nnet3-train fails to get a GPU:
nnet3-train --print-interval=10 --momentum=0.5 --max-param-change=2.0 --optimization.min-deriv-time=0 'nnet3-am-copy --raw=true --learning-rate=0.0009 exp/nnet3/lstm_bidirectional_sp/0.mdl - |' 'ark,bg:nnet3-copy-egs --left-context=42 --right-context=42 ark:exp/nnet3/lstm_bidirectional_sp/egs/egs.1.ark ark:- | nnet3-shuffle-egs --buffer-size=5000 --srand=0 ark:- ark:-| nnet3-merge-egs --minibatch-size=50 --measure-output-frames=false --discard-partial-minibatches=true ark:- ark:- |' exp/nnet3/lstm_bidirectional_sp/1.1.raw
WARNING (nnet3-train:SelectGpuId():cu-device.cc:137) Will try again to get a GPU after 20 seconds.
Wed Jun 15 11:45:39 2016
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.27                 Driver Version: 367.27                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 0000:02:00.0     Off |                  N/A |
| 27%   35C    P8     6W / 180W |      2MiB /  8113MiB |      0%   E. Process |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    Off  | 0000:03:00.0     Off |                  N/A |
| 27%   36C    P8     5W / 180W |      2MiB /  8113MiB |      0%   E. Process |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 1080    Off  | 0000:82:00.0     Off |                  N/A |
| 27%   34C    P8     6W / 180W |      2MiB /  8113MiB |      0%   E. Process |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 1080    Off  | 0000:83:00.0     Off |                  N/A |
| 27%   29C    P8     6W / 180W |     10MiB /  8113MiB |      0%   E. Process |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
LOG (nnet3-train:SelectGpuId():cu-device.cc:146) num-gpus=4. Device 0: all CUDA-capable devices are busy or unavailable. Device 1: all CUDA-capable devices are busy or unavailable. Device 2: all CUDA-capable devices are busy or unavailable. Device 3: all CUDA-capable devices are busy or unavailable.
ERROR (nnet3-train:SelectGpuId():cu-device.cc:147) Failed to create CUDA context, no more unused GPUs?
ERROR (nnet3-train:SelectGpuId():cu-device.cc:147) Failed to create CUDA context, no more unused GPUs?
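
For anyone trying to reproduce this, it may help to confirm the compute mode the driver reports for each GPU before launching training. The following is only a rough sketch: the function names `parse_compute_modes` and `query_compute_modes` are made up for illustration, it assumes `nvidia-smi` is on the PATH and accepts the `--query-gpu=index,compute_mode --format=csv,noheader` options, and the exact mode strings it prints (for example `Exclusive_Process`) may differ between driver versions.

```python
import subprocess


def parse_compute_modes(csv_text):
    """Parse 'index, compute_mode' CSV rows into a {gpu_index: mode} dict."""
    modes = {}
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        index, mode = [field.strip() for field in line.split(",", 1)]
        modes[int(index)] = mode
    return modes


def query_compute_modes():
    """Ask nvidia-smi for the compute mode of every GPU (assumes it is on PATH)."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=index,compute_mode", "--format=csv,noheader"],
        universal_newlines=True,
    )
    return parse_compute_modes(out)
```

Calling `query_compute_modes()` on the machine above would be expected to return one entry per GPU (indices 0 to 3), which can be compared with the "Compute M." column in the nvidia-smi table in the log.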