CUBLAS_STATUS_NOT_INITIALIZED Error


Prajwal Rao

Mar 6, 2018, 9:33:58 AM
to kaldi-help
Hi all, 

I have been trying to run the librispeech example with some GSM data.
Although my CUDA installation (version 8.0) is working fine, nnet-train-simple throws the following error:



# nnet-shuffle-egs --buffer-size=5000 --srand=0 ark:exp/nnet5a_clean_100_gpu/egs/egs.1.0.ark ark:- | nnet-train-simple --minibatch-size=256 --srand=0 exp/nnet5a_clean_100_gpu/0.mdl ark:- exp/nnet5a_clean_100_gpu/1.1.mdl
# Started at Tue Mar  6 19:47:59 IST 2018
#
nnet-shuffle-egs --buffer-size=5000 --srand=0 ark:exp/nnet5a_clean_100_gpu/egs/egs.1.0.ark ark:-
nnet-train-simple --minibatch-size=256 --srand=0 exp/nnet5a_clean_100_gpu/0.mdl ark:- exp/nnet5a_clean_100_gpu/1.1.mdl
WARNING (nnet-train-simple[5.2.119~123-807dc]:SelectGpuId():cu-device.cc:182) Suggestion: use 'nvidia-smi -c 3' to set compute exclusive mode
LOG (nnet-train-simple[5.2.119~123-807dc]:SelectGpuIdAuto():cu-device.cc:300) Selecting from 1 GPUs
LOG (nnet-train-simple[5.2.119~123-807dc]:SelectGpuIdAuto():cu-device.cc:315) cudaSetDevice(0): Quadro P5000 free:15373M, used:892M, total:16265M, free/total:0.945127
LOG (nnet-train-simple[5.2.119~123-807dc]:SelectGpuIdAuto():cu-device.cc:364) Trying to select device: 0 (automatically), mem_ratio: 0.945127
LOG (nnet-train-simple[5.2.119~123-807dc]:SelectGpuIdAuto():cu-device.cc:383) Success selecting device 0 free mem ratio: 0.945127
ERROR (nnet-train-simple[5.2.119~123-807dc]:FinalizeActiveGpu():cu-device.cc:217) cublasStatus_t 1 : "CUBLAS_STATUS_NOT_INITIALIZED" returned from 'cublasCreate(&handle_)'

[ Stack-Trace: ]
kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::MessageLogger::~MessageLogger()
kaldi::CuDevice::FinalizeActiveGpu()
kaldi::CuDevice::SelectGpuId(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
main
__libc_start_main
_start

bash: line 1:  8284 Broken pipe             nnet-shuffle-egs --buffer-size=5000 --srand=0 ark:exp/nnet5a_clean_100_gpu/egs/egs.1.0.ark ark:-
      8285 Segmentation fault      (core dumped) | nnet-train-simple --minibatch-size=256 --srand=0 exp/nnet5a_clean_100_gpu/0.mdl ark:- exp/nnet5a_clean_100_gpu/1.1.mdl
# Accounting: time=2 threads=1
# Ended (code 139) at Tue Mar  6 19:48:01 IST 2018, elapsed time 2 seconds


Any suggestions?

Thanks in advance.

Regards,
Prajwal

Daniel Povey

Mar 6, 2018, 12:20:37 PM
to kaldi-help

First try running the tests in cudamatrix/.
If they fail, it is likely either:
  - You have an incompatible version of cublas on your path...  do something like `ldd ./cu-vector-test` to figure out which one.
  - There is some issue of write permission to your directory ~/.nv/.  Yenda reported problems one time when that was on NFS, due to some kind of permission issue.  You can do e.g. `strace ./cu-vector-test` to see whether, before it dies, it attempts to access some subdirectory of ~/.nv/.


Also, that error can occasionally arise randomly (non-repeatably), when running multiple jobs over NFS, due to contention for locking a file in ~/.nv/.  That is due to a driver bug whose fix is going to be released soon.   But since it happened on the first iteration, that won't be your problem.
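
The two checks above can be sketched as a pair of shell helpers. This is only a minimal sketch and not part of Kaldi: the function names `list_cuda_libs` and `check_writable_dir` are made up for illustration, and the usage lines assume the tests in cudamatrix/ have already been built.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not Kaldi or CUDA tooling.

# Read `ldd` output on stdin and print each CUDA library together with
# the path it resolves to (or NOT FOUND if the loader cannot locate it).
list_cuda_libs() {
  awk '/libcu(blas|dart|rand|sparse)/ {
    if ($3 == "not") print $1, "NOT FOUND"
    else print $1, $3
  }'
}

# Report whether a directory exists and is writable by the current user.
check_writable_dir() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
  elif [ -w "$dir" ]; then
    echo "writable: $dir"
  else
    echo "not writable: $dir"
  fi
}

# Example usage, run from the cudamatrix/ directory:
#   ldd ./cu-vector-test | list_cuda_libs
#   check_writable_dir "$HOME/.nv"
#   strace -f -e trace=file ./cu-vector-test 2>&1 | grep '\.nv'
```

If the listed paths point at a different CUDA toolkit than the one Kaldi was configured with, or if ~/.nv turns out not to be writable, those are the first things to look at.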

Dan
   

--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+unsubscribe@googlegroups.com.
To post to this group, send email to kaldi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/ae6f4d65-e485-4a64-817b-92d2f8fb523a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Joe

Jan 12, 2019, 10:27:04 PM
to kaldi-help
Hi Dan,

I encountered the same problem with CUDA 9.0 and the Kaldi 5.4 branch:

$ ./cu-device-test
LOG ([5.4.271~1-e50bd]:SelectGpuId():cu-device.cc:127) Manually selected to compute on CPU.
......
LOG ([5.4.271~1-e50bd]:TestCuMatrixResize():cu-device-test.cc:76) For CuMatrix::Resize<double>, for size_multiple = 16, speed was 790.643 gigaflops.

LOG ([5.4.271~1-e50bd]:SelectGpuId():cu-device.cc:197) CUDA setup operating under Compute Exclusive Mode.
ERROR ([5.4.271~1-e50bd]:FinalizeActiveGpu():cu-device.cc:245) cublasStatus_t 1 : "CUBLAS_STATUS_NOT_INITIALIZED" returned from 'cublasCreate(&cublas_handle_)'

[ Stack-Trace: ]
kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::FatalMessageLogger::~FatalMessageLogger()
kaldi::CuDevice::FinalizeActiveGpu()
kaldi::CuDevice::SelectGpuId(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
main
__libc_start_main
_start

terminate called after throwing an instance of 'std::runtime_error'
  what():
Aborted (core dumped)

$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

$ ldd ./cu-device-test
        linux-vdso.so.1 =>  (0x00007ffdd41b2000)
        libcblas.so.3 => /usr/lib/libcblas.so.3 (0x00007fef76056000)
        liblapack_atlas.so.3 => /usr/lib/liblapack_atlas.so.3 (0x00007fef75dfa000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fef75bdd000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fef759d9000)
        libcublas.so.9.0 => /usr/local/cuda-9.0/lib64/libcublas.so.9.0 (0x00007fef725a3000)
        libcusparse.so.9.0 => /usr/local/cuda-9.0/lib64/libcusparse.so.9.0 (0x00007fef6ee3d000)
        libcudart.so.9.0 => /usr/local/cuda-9.0/lib64/libcudart.so.9.0 (0x00007fef6ebd0000)
        libcurand.so.9.0 => /usr/local/cuda-9.0/lib64/libcurand.so.9.0 (0x00007fef6ac6c000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fef6a8ea000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fef6a5e1000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fef6a3cb000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fef6a001000)
        libatlas.so.3 => /usr/lib/libatlas.so.3 (0x00007fef69a63000)
        libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 (0x00007fef69738000)
        libf77blas.so.3 => /usr/lib/libf77blas.so.3 (0x00007fef69518000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fef76278000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fef69310000)
        libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00007fef690d1000)

The permissions on ~/.nv are OK.

The Kaldi source is the latest commit of the 5.4 branch, and I compiled Kaldi with:
./configure --cudatk-dir=/usr/local/cuda-9.0
make


Could you please give me any suggestions?
Thank you.


Cheers,
Joe


Daniel Povey

Jan 12, 2019, 10:51:29 PM
to kaldi-help
Check what I asked the original poster to check, i.e. whether the directory ~/.nv/ has write permissions.
I haven't seen this error for a while and may not remember exactly how to debug it.

Daniel Povey

Jan 12, 2019, 10:53:16 PM
to kaldi-help
Also possibly doing
export CUDA_CACHE_DISABLE=1 
in your profile or the path.sh might help.
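
For context, CUDA_CACHE_DISABLE=1 turns off the driver's just-in-time compilation cache, which by default lives under ~/.nv/ComputeCache, so the process no longer needs to lock or write files there. A minimal sketch of adding the line to path.sh without duplicating it on repeated runs; the helper name `disable_cuda_cache_in` is hypothetical:

```shell
#!/usr/bin/env bash
# Sketch only: the helper name is an assumption, not a Kaldi script.

# Append `export CUDA_CACHE_DISABLE=1` to the given file unless that
# exact line is already present.
disable_cuda_cache_in() {
  local file="$1"
  local line='export CUDA_CACHE_DISABLE=1'
  if ! grep -qxF "$line" "$file" 2>/dev/null; then
    echo "$line" >> "$file"
  fi
}

# Example usage from a recipe directory such as egs/librispeech/s5:
#   disable_cuda_cache_in ./path.sh && . ./path.sh
```

Setting the variable in path.sh rather than only in an interactive shell matters because the training jobs source path.sh when they are launched.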

Joe

Jan 13, 2019, 9:09:10 AM
to kaldi-help
Thank you Dan, I tried CUDA_CACHE_DISABLE=1 and it worked! (although I don't understand the reason)
Amazing!

gaoxing...@163.com

Mar 18, 2019, 10:22:38 AM
to dpovey, kaldi-help


Maybe I didn't make it clear. I mean that, on the G layer or FSA layer, I artificially construct an <eps> transition parallel to the spoken-noise symbol, so that if there is no noise in the speech it can be skipped, and if there is, it can be detected.



On Sun, Mar 17, 2019 at 10:38 PM Daniel Povey <dpo...@gmail.com> wrote:
it is possible to work out from the normal alignments etc., whether there was silence there.
Search in ru.sh for get_prons.sh and dict_dir_add_pronprobs.sh, to see how they do it.

On Sun, Mar 17, 2019 at 10:30 PM gaoxing...@163.com <gaoxing...@163.com> wrote:
Also, I know that optional silence actually works on this principle. I want to know which program implements it.


 
Date: 2019-03-18 10:16
Subject: Re: I have some question about alignment,thanks
You probably don't really want the #0 there, although in some circumstances it wouldn't matter because it would be deleted later anyway.


On Sun, Mar 17, 2019 at 10:13 PM gaoxing...@163.com <gaoxing...@163.com> wrote:
Hi, Dan

I want to build an alternative alignment graph, and I designed the following structure.

I want to skip a special word when that word does not actually occur in the audio.

But I found that the resulting alignments always traverse the "word" arc and never the "#0" arc.

Is something wrong?

Thanks~











Catch(03-18-10-2(03-18-22-21-32).jpg

Daniel Povey

Mar 18, 2019, 11:35:16 AM
to kaldi-help
Well there are two separate issues: what to do in test time versus training.  In training time you 
only have a linear transcript, at least using the normal scripts, so you have to do it in L.fst.

I don't see how what you propose really differs from the normal way we do optional silence,
except you want the symbol to be displayed.  It is quite possible to display it from the regular decoding,
because the information is not lost.  E.g. use the --silence-label option to lattice-align-words.  That only works, 
though, if you have a single symbol for all your optional-silences.

Dan





gaoxing...@163.com

Mar 18, 2019, 9:58:40 PM
to kaldi-help

Thank you very much for your patient explanation.
I understand what you mean. In fact, I want two optional symbols, not just one, in both the testing and training phases: silence and spoken noise.
Both can be adjusted through different weights assigned manually in advance, resulting in different decoding paths.
This method can segment speech more precisely.

Thanks~

Xinglong


Catch(03-18-10-2(03-19-09-47-31).jpg

Daniel Povey

Mar 18, 2019, 10:21:16 PM
to kaldi-help
Unless you have specific supervision information to train them separately, I doubt very much that you would get any benefit out of it.  Even if you do have supervision for noise etc., it's usually best just to map it to silence.
You could certainly try to change the make_lexicon_* scripts to support adding a second optional silence, but it would require an understanding of FST determinization issues and disambiguation symbols.

Dan

gaoxing...@163.com

Mar 18, 2019, 10:44:10 PM
to kaldi-help
Thank you very much.
Can I implement this strategy in the test phase?
I want to add an <eps> arc when constructing the decoding network.
This <eps> arc has a weight assigned in advance and is parallel to <spoken noise>. That way, if noise is present the path goes through <spoken noise>; otherwise it is skipped.
I've tried to do that.
However, I found that the decoding path always goes through the <spoken noise> arc.


Catch(03-18-10-2(03-19-10-39-00).jpg

Daniel Povey

Mar 18, 2019, 10:54:47 PM
to kaldi-help
I suggest searching for hbka.pdf (a chapter by Mohri) on FST-based decoding graph construction and reading it, to understand disambiguation symbols.

I doubt you will be able to do this, and I don't have time to go through the individual steps.

gaoxing...@163.com

Mar 18, 2019, 11:07:16 PM
to kaldi-help

Okay, thanks.
I think this would be much easier to implement with a tree-based decoder.
Also, <eps> cannot be inserted into transcripts or the ARPA file, or the result will not be determinizable.
All symbols should have physical meanings.

Catch(03-18-10-2(03-19-11-03-40).jpg

Daniel Povey

Mar 18, 2019, 11:15:23 PM
to kaldi-help
That's not correct; read the thing I pointed you to.

gaoxing...@163.com

May 6, 2019, 1:52:51 AM
to kaldi-help

 # nnet3-train --use-gpu=wait --read-cache=exp/nnet3/1000h_segmented_6w_other_offline_1024_deeper/cache.6 --print-interval=10 --momentum=0.0 --max-param-change=2.0 --backstitch-training-scale=0.0 --l2-regularize-factor=0.333333333333 --backstitch-training-interval=1 --srand=6 "nnet3-copy --learning-rate=0.00504944042974 --scale=1.0 exp/nnet3/1000h_segmented_6w_other_offline_1024_deeper/6.mdl - |" "ark,bg:nnet3-copy-egs --frame=5              ark:exp/nnet3/1000h_segmented_6w_other_offline_1024_deeper/egs/egs.21.ark ark:- |             nnet3-shuffle-egs --buffer-size=5000             --srand=6 ark:- ark:- |              nnet3-merge-egs --minibatch-size=512 ark:- ark:- |" exp/nnet3/1000h_segmented_6w_other_offline_1024_deeper/7.3.raw 
# Started at Mon May  6 01:38:20 EDT 2019
#
nnet3-train --use-gpu=wait --read-cache=exp/nnet3/1000h_segmented_6w_other_offline_1024_deeper/cache.6 --print-interval=10 --momentum=0.0 --max-param-change=2.0 --backstitch-training-scale=0.0 --l2-regularize-factor=0.333333333333 --backstitch-training-interval=1 --srand=6 'nnet3-copy --learning-rate=0.00504944042974 --scale=1.0 exp/nnet3/1000h_segmented_6w_other_offline_1024_deeper/6.mdl - |' 'ark,bg:nnet3-copy-egs --frame=5              ark:exp/nnet3/1000h_segmented_6w_other_offline_1024_deeper/egs/egs.21.ark ark:- |             nnet3-shuffle-egs --buffer-size=5000             --srand=6 ark:- ark:- |              nnet3-merge-egs --minibatch-size=512 ark:- ark:- |' exp/nnet3/1000h_segmented_6w_other_offline_1024_deeper/7.3.raw 
WARNING (nnet3-train[5.5]:SelectGpuId():cu-device.cc:207) Waited 0 seconds before creating CUDA context
LOG (nnet3-train[5.5]:SelectGpuId():cu-device.cc:216) CUDA setup operating under Compute Exclusive Mode.
ERROR (nnet3-train[5.5]:FinalizeActiveGpu():cu-device.cc:264) cublasStatus_t 1 : "CUBLAS_STATUS_NOT_INITIALIZED" returned from 'cublasCreate(&cublas_handle_)'

[ Stack-Trace: ]
kaldi::MessageLogger::LogMessage() const
kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)
kaldi::CuDevice::FinalizeActiveGpu()
kaldi::CuDevice::SelectGpuId(std::string)
main
__libc_start_main
nnet3-train() [0x497319]

kaldi::KaldiFatalError
# Accounting: time=1 threads=1
# Ended (code 255) at Mon May  6 01:38:21 EDT 2019, elapsed time 1 seconds

 


Daniel Povey

May 6, 2019, 12:29:13 PM
to kaldi-help
Try adding
export CUDA_CACHE_DISABLE=1
to your path.sh.
That error can happen when there is difficulty locking something in the ~/.nv/ directory.

seiten kaku

Feb 19, 2020, 11:47:14 AM
to kaldi-help
I encountered the same problem after downgrading CUDA from 10.2 to 10.1.
Following Dan's suggestion, I ran 'ldd ./cu-vector-test' and found that libcublas.so and libcublasLt.so were still linked to their 10.2.xxx versions, so I fixed the links and it worked.
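
A rough sketch of that check, in case it helps others: `ldd` only shows the soname path, so the real file behind each symlink has to be resolved separately. The helper name `resolve_cublas_links` is hypothetical, and it assumes a GNU `readlink` that supports `-f`.

```shell
#!/usr/bin/env bash
# Sketch only: the helper name is an assumption, not a Kaldi or CUDA tool.

# Read `ldd` output on stdin and, for every cuBLAS library (libcublas,
# libcublasLt), print the soname and the real file it resolves to after
# following symlinks.
resolve_cublas_links() {
  awk '/libcublas/ && $2 == "=>" && $3 ~ /^\// { print $1, $3 }' |
  while read -r soname path; do
    echo "$soname -> $(readlink -f "$path")"
  done
}

# Example usage, run from the cudamatrix/ directory:
#   ldd ./cu-vector-test | resolve_cublas_links
```

If the resolved file names carry a different version than the toolkit reported by `nvcc -V`, the symlinks (or the library search path) are pointing at a leftover installation.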


On Tuesday, March 6, 2018 at 10:33:58 PM UTC+8, Prajwal Rao wrote: