Thanks for your fast reply :)
Your instructions definitely make sense; the commands work immediately and emit files of roughly the right size.
Graph/Model mismatch failure:
However, when I start decoding with the newly-built HCLG.fst, it fails immediately like this: "Likely graph/model mismatch (graph built from wrong model?)"
Is it intended that the resulting HCLG.fst will be usable as-is with the out-of-the-box model, for arbitrary decoders? Or maybe more adjustments are necessary?
I tried applying several of the options from fstcompose --help, such as --compose_filter and --fst_align. Some of these take a surprising amount of RAM, but none of them seem to help.
fstcompose verbose output:
Here's what fstcompose looks like when running in verbose mode on my inputs (with no extra flags):
/opt/kaldi/tools/openfst/bin/fstcompose --v=100 HCLr.fst Gr.fst > HCLG.fst
INFO: FstImpl::ReadHeader: source: HCLr.fst, fst_type: olabel_lookahead, arc_type: standard, version: 1, flags: 0
INFO: FstImpl::ReadHeader: source: HCLr.fst, fst_type: const, arc_type: standard, version: 2, flags: 0
INFO: memorymap: false source: "HCLr.fst" size: 3273260 offset: 145
INFO: Read 3273260 bytes. 0 remaining
INFO: memorymap: false source: "HCLr.fst" size: 8777040 offset: 3273405
INFO: Read 8777040 bytes. 0 remaining
INFO: FstImpl::ReadHeader: source: Gr.fst, fst_type: ngram, arc_type: standard, version: 4, flags: 3
INFO: ComposeFstImpl: Match type: 3
INFO: # of calls: 3.54821e+07
INFO: # of intervals/call: 18.08
Full error dump and decoder invocation:
Here's an attempt to decode, along with the resulting failure and accompanying stack trace:
root@d1512f83e4a5:/workspace/models/vosk-model-small-pt-0.3# /opt/kaldi/src/cudadecoderbin/batched-wav-nnet3-cuda2 \
--num-channels=300 \
--cuda-use-tensor-cores=true \
--main-q-capacity=30000 \
--aux-q-capacity=400000 \
--cuda-memory-proportion=.5 \
--max-batch-size=200 \
--cuda-worker-threads=16 \
--cuda-decoder-copy-threads=2 \
--frame-subsampling-factor=3 \
--frames-per-chunk=153 \
--max-mem=100000000 \
--beam=10 \
--lattice-beam=7 \
--acoustic-scale=1.0 \
--determinize-lattice=true \
--max-active=10000 \
--iterations=1 \
--file-limit=500 \
--config=$MODELS/$MODEL/online.conf \
/workspace/models/vosk-model-small-pt-0.3/final.mdl \
/workspace/models/vosk-model-small-pt-0.3/HCLG.fst \
scp:/workspace/wav.scp \
'ark:|gzip -c > /workspace/lattice_test.gz'
/opt/kaldi/src/cudadecoderbin/batched-wav-nnet3-cuda2 --num-channels=300 --cuda-use-tensor-cores=true --main-q-capacity=30000 --aux-q-capacity=400000 --cuda-memory-proportion=.5 --max-batch-size=200 --cuda-worker-threads=16 --cuda-decoder-copy-threads=2 --frame-subsampling-factor=3 --frames-per-chunk=153 --max-mem=100000000 --beam=10 --lattice-beam=7 --acoustic-scale=1.0 --determinize-lattice=true --max-active=10000 --iterations=1 --file-limit=500 --config=/workspace/models/vosk-model-small-pt-0.3/online.conf /workspace/models/vosk-model-small-pt-0.3/final.mdl /workspace/models/vosk-model-small-pt-0.3/HCLG.fst scp:/workspace/wav.scp 'ark:|gzip -c > /workspace/lattice_test.gz'
WARNING (batched-wav-nnet3-cuda2[5.5]:SelectGpuId():cu-device.cc:243) Not in compute-exclusive mode. Suggestion: use 'nvidia-smi -c 3' to set compute exclusive mode
LOG (batched-wav-nnet3-cuda2[5.5]:SelectGpuIdAuto():cu-device.cc:438) Selecting from 1 GPUs
LOG (batched-wav-nnet3-cuda2[5.5]:SelectGpuIdAuto():cu-device.cc:453) cudaSetDevice(0): Tesla T4 free:14989M, used:120M, total:15109M, free/total:0.992058
LOG (batched-wav-nnet3-cuda2[5.5]:SelectGpuIdAuto():cu-device.cc:501) Device: 0, mem_ratio: 0.992058
LOG (batched-wav-nnet3-cuda2[5.5]:SelectGpuId():cu-device.cc:382) Trying to select device: 0
LOG (batched-wav-nnet3-cuda2[5.5]:SelectGpuIdAuto():cu-device.cc:511) Success selecting device 0 free mem ratio: 0.992058
LOG (batched-wav-nnet3-cuda2[5.5]:FinalizeActiveGpu():cu-device.cc:338) The active GPU is [0]: Tesla T4 free:14561M, used:548M, total:15109M, free/total:0.963732 version 7.5
LOG (batched-wav-nnet3-cuda2[5.5]:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 1 orphan nodes.
LOG (batched-wav-nnet3-cuda2[5.5]:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 2 orphan components.
LOG (batched-wav-nnet3-cuda2[5.5]:Collapse():nnet-utils.cc:1488) Added 1 components, removed 2
LOG (batched-wav-nnet3-cuda2[5.5]:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (batched-wav-nnet3-cuda2[5.5]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (batched-wav-nnet3-cuda2[5.5]:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (batched-wav-nnet3-cuda2[5.5]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (batched-wav-nnet3-cuda2[5.5]:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (batched-wav-nnet3-cuda2[5.5]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (batched-wav-nnet3-cuda2[5.5]:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (batched-wav-nnet3-cuda2[5.5]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
ASSERTION_FAILED (batched-wav-nnet3-cuda2[5.5]:TransitionIdToPdf():hmm/transition-model.h:328) Assertion failed: (static_cast<size_t>(trans_id) < id2pdf_id_.size() && "Likely graph/model mismatch (graph built from wrong model?)")
[ Stack-Trace: ]
/opt/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0x793) [0x7fa32171e183]
/opt/kaldi/src/lib/libkaldi-base.so(kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)+0x72) [0x7fa32171eb84]
/opt/kaldi/src/lib/libkaldi-cudadecoder.so(kaldi::cuda_decoder::CudaFst::ApplyTransitionModelOnIlabels(kaldi::TransitionModel const&)+0x73) [0x7fa322f5d3bf]
/opt/kaldi/src/lib/libkaldi-cudadecoder.so(kaldi::cuda_decoder::CudaFst::Initialize(fst::Fst<fst::ArcTpl<fst::TropicalWeightTpl<float> > > const&, kaldi::TransitionModel const*)+0x9e) [0x7fa322f5e980]
/opt/kaldi/src/lib/libkaldi-cudadecoder.so(kaldi::cuda_decoder::BatchedThreadedNnet3CudaOnlinePipeline::AllocateAndInitializeData(fst::Fst<fst::ArcTpl<fst::TropicalWeightTpl<float> > > const&)+0x94f) [0x7fa322f5fdf1]
/opt/kaldi/src/lib/libkaldi-cudadecoder.so(kaldi::cuda_decoder::BatchedThreadedNnet3CudaOnlinePipeline::Initialize(fst::Fst<fst::ArcTpl<fst::TropicalWeightTpl<float> > > const&)+0x20) [0x7fa322f6273c]
/opt/kaldi/src/lib/libkaldi-cudadecoder.so(kaldi::cuda_decoder::BatchedThreadedNnet3CudaOnlinePipeline::BatchedThreadedNnet3CudaOnlinePipeline(kaldi::cuda_decoder::BatchedThreadedNnet3CudaOnlinePipelineConfig const&, fst::Fst<fst::ArcTpl<fst::TropicalWeightTpl<float> > > const&, kaldi::nnet3::AmNnetSimple const&, kaldi::TransitionModel const&)+0xa12) [0x7fa322f8cb7a]
/opt/kaldi/src/lib/libkaldi-cudadecoder.so(kaldi::cuda_decoder::BatchedThreadedNnet3CudaPipeline2::BatchedThreadedNnet3CudaPipeline2(kaldi::cuda_decoder::BatchedThreadedNnet3CudaPipeline2Config const&, fst::Fst<fst::ArcTpl<fst::TropicalWeightTpl<float> > > const&, kaldi::nnet3::AmNnetSimple const&, kaldi::TransitionModel const&)+0x48) [0x7fa322f847ae]
/opt/kaldi/src/cudadecoderbin/batched-wav-nnet3-cuda2(main+0xe36) [0x56397b090c23]
/usr/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7fa320d3d0b3]
/opt/kaldi/src/cudadecoderbin/batched-wav-nnet3-cuda2(_start+0x2e) [0x56397b08d1ae]
Aborted (core dumped)
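Possible sanity check:
As far as I can tell from the trace, the assertion fires in CudaFst::ApplyTransitionModelOnIlabels when an input label in the graph is not smaller than the size of the model's id2pdf_id_ table. So one thing I could compare is the largest input label in HCLG.fst against the number of transition-ids in final.mdl. Here is a minimal sketch of the first half. The helper name max_ilabel is my own (hypothetical), and it assumes the graph is dumped as text by fstprint without symbol tables, so that arc lines look like "src dst ilabel olabel [weight]" and final-state lines have at most two fields.

```shell
#!/bin/sh
# max_ilabel: read fstprint-style text on stdin and print the largest
# input label seen on any arc.
# Assumption: arc lines have at least 4 whitespace-separated fields
# (src dst ilabel olabel [weight]); final-state lines have 1 or 2 fields
# and are skipped.
max_ilabel() {
  awk '
    BEGIN { max = 0 }
    NF >= 4 {
      ilabel = $3 + 0            # force numeric interpretation
      if (ilabel > max) max = ilabel
    }
    END { print max }
  '
}

# Intended usage (not run here; path assumed from my setup):
#   /opt/kaldi/tools/openfst/bin/fstprint HCLG.fst | max_ilabel
```

If the printed value is larger than the number of transition-ids the model has, that would be consistent with the mismatch message, but I have not confirmed that this is the actual cause.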