I want to implement the ReLU GRU from the Interspeech 2017 paper "Improving speech recognition by revising gated recurrent units".
I have already implemented the plain GRU successfully, and it trains fine.
But when I simply replace the tanh with a ReLU component, Kaldi crashes. The log looks like this:
nnet3-chain-train --apply-deriv-weights=False --l2-regularize=5e-05 --leaky-hmm-coefficient=0.1 --write-cache=exp/chain/gru_6j_relu_ld5_sp/cache.1 --xent-regularize=0.025 --optimization.min-deriv-time=-8 --optimization.max-deriv-time-relative=15 --print-interval=10 --momentum=0.0 --max-param-change=1.41421356237 --backstitch-training-scale=0.0 --backstitch-training-interval=1 --srand=0 'nnet3-am-copy --raw=true --learning-rate=0.003 --scale=0.99 exp/chain/gru_6j_relu_ld5_sp/0.mdl - |' exp/chain/gru_6j_relu_ld5_sp/den.fst 'ark,bg:nnet3-chain-copy-egs --frame-shift=1 ark:/nobackup/f1/asr/zhangshaofu/kaldi/egs/swbdgru/s5c/exp/chain/gru_6j_ld5_sp/egs/cegs.1.ark ark:- | nnet3-chain-shuffle-egs --buffer-size=5000 --srand=0 ark:- ark:- | nnet3-chain-merge-egs --minibatch-size=32 ark:- ark:- |' exp/chain/gru_6j_relu_ld5_sp/1.1.raw
LOG (nnet3-chain-train[5.2]:IsComputeExclusive():cu-device.cc:263) CUDA setup operating under Compute Exclusive Process Mode.
LOG (nnet3-chain-train[5.2]:FinalizeActiveGpu():cu-device.cc:225) The active GPU is [3]: Tesla K20m free:4704M, used:95M, total:4799M, free/total:0.980196 version 3.5
nnet3-am-copy --raw=true --learning-rate=0.003 --scale=0.99 exp/chain/gru_6j_relu_ld5_sp/0.mdl -
WARNING (nnet3-am-copy[5.2]:Check():nnet-nnet.cc:783) Node lda.delayed is never used to compute any output.
LOG (nnet3-am-copy[5.2]:main():nnet3-am-copy.cc:140) Copied neural net from exp/chain/gru_6j_relu_ld5_sp/0.mdl to raw format as -
WARNING (nnet3-chain-train[5.2]:Check():nnet-nnet.cc:783) Node lda.delayed is never used to compute any output.
WARNING (nnet3-chain-train[5.2]:Check():nnet-nnet.cc:783) Node lda.delayed is never used to compute any output.
nnet3-chain-shuffle-egs --buffer-size=5000 --srand=0 ark:- ark:-
nnet3-chain-merge-egs --minibatch-size=32 ark:- ark:-
nnet3-chain-copy-egs --frame-shift=1 ark:/nobackup/f1/asr/zhangshaofu/kaldi/egs/swbdgru/s5c/exp/chain/gru_6j_ld5_sp/egs/cegs.1.ark ark:-
ASSERTION_FAILED (nnet3-chain-train[5.2]:HouseBackward():qr.cc:124) : 'KALDI_ISFINITE(sigma) && "Tridiagonalizing matrix that is too large or has NaNs."'
[ Stack-Trace: ]
nnet3-chain-train() [0x11a26c0]
kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::MessageLogger::~MessageLogger()
kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)
void kaldi::HouseBackward<float>(int, float const*, float*, float*)
kaldi::SpMatrix<float>::Tridiagonalize(kaldi::MatrixBase<float>*)
kaldi::SpMatrix<float>::Eig(kaldi::VectorBase<float>*, kaldi::MatrixBase<float>*) const
kaldi::nnet3::OnlineNaturalGradient::PreconditionDirectionsInternal(int, float, kaldi::Vector<float> const&, kaldi::CuMatrixBase<float>*, kaldi::CuMatrixBase<float>*, kaldi::CuVectorBase<float>*, float*)
kaldi::nnet3::OnlineNaturalGradient::PreconditionDirections(kaldi::CuMatrixBase<float>*, kaldi::CuVectorBase<float>*, float*)
kaldi::nnet3::OnlineNaturalGradient::PreconditionDirections(kaldi::CuMatrixBase<float>*, kaldi::CuVectorBase<float>*, float*)
.
.
.
kaldi::nnet3::OnlineNaturalGradient::PreconditionDirections(kaldi::CuMatrixBase<float>*, kaldi::CuVectorBase<float>*, float*)
kaldi::nnet3::NaturalGradientAffineComponent::Update(std::string const&, kaldi::CuMatrixBase<float> const&, kaldi::CuMatrixBase<float> const&)
kaldi::nnet3::AffineComponent::Backprop(std::string const&, kaldi::nnet3::ComponentPrecomputedIndexes const*, kaldi::CuMatrixBase<float> const&, kaldi::CuMatrixBase<float> const&, kaldi::CuMatrixBase<float> const&, void*, kaldi::nnet3::Component*, kaldi::CuMatrixBase<float>*) const
kaldi::nnet3::NnetComputer::ExecuteCommand()
kaldi::nnet3::NnetComputer::Run()
kaldi::nnet3::NnetChainTrainer::TrainInternal(kaldi::nnet3::NnetChainExample const&, kaldi::nnet3::NnetComputation const&)
kaldi::nnet3::NnetChainTrainer::Train(kaldi::nnet3::NnetChainExample const&)
main
__libc_start_main
nnet3-chain-train() [0xcf4999]
# Accounting: time=6 threads=1
# Finished at Thu Oct 26 19:15:44 CST 2017 with status 134
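
To make the change concrete, here is a minimal NumPy sketch of a GRU step where only the candidate activation is swapped from tanh to ReLU. This is not my Kaldi config: it assumes the standard GRU equations with the reset gate kept, and the names `gru_step`, `relu` and `max_abs_state`, the random weights, and the `scale` parameter are hypothetical, chosen only for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(x, 0.0)

def gru_step(x, h, params, candidate_act=np.tanh):
    """One step of a standard GRU; candidate_act is tanh or relu."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(Wz @ x + Uz @ h)                     # update gate
    r = sigmoid(Wr @ x + Ur @ h)                     # reset gate
    h_tilde = candidate_act(Wh @ x + Uh @ (r * h))   # candidate state
    # New state is a gated mix of the old state and the candidate.
    return (1.0 - z) * h + z * h_tilde

def max_abs_state(candidate_act, steps=200, dim=8, scale=4.0, seed=0):
    """Run a random input sequence and return the largest |h| at the end."""
    rng = np.random.default_rng(seed)
    # Hypothetical random weights; scale controls how large they are.
    params = [scale * rng.standard_normal((dim, dim)) / np.sqrt(dim)
              for _ in range(6)]
    h = np.zeros(dim)
    for _ in range(steps):
        x = rng.standard_normal(dim)
        h = gru_step(x, h, params, candidate_act)
    return float(np.max(np.abs(h)))
```

Comparing `max_abs_state(np.tanh)` with `max_abs_state(relu)` shows the difference I have in mind: starting from a zero state, the tanh version keeps every element of `h` within [-1, 1], because each step mixes the old state with a value in that range. The ReLU candidate has no upper bound, so nothing in the recurrence itself limits how large the state can get. I suspect, but have not confirmed, that this is related to the non-finite values the assertion complains about.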