Kaldi decoding error on large files


Jaskaran Singh Puri

Jul 23, 2019, 1:45:56 PM
to kaldi-help
I'm seeing the following error while decoding multiple files, each more than 10 minutes long, in parallel:

WARNING (nnet3-latgen-faster:CheckMemoryUsage():determinize-lattice-pruned.cc:327) Did not reach requested beam in determinize-lattice: size exceeds maximum 10000000 bytes;

Can we change this memory limit, or how else can we decode large files in parallel?

Daniel Povey

Jul 23, 2019, 2:59:12 PM
to kaldi-help
It's not an error, it's a warning.  It means that lattice generation produced a less-deep-than-normal lattice.
It shouldn't be a problem; it won't affect the one-best path, which may be all you need.
Dan
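
(As a concrete illustration of "the one-best path", here is a hedged sketch of extracting it from the generated lattices afterwards; the archive name, paths and scale below are placeholders, not taken from this thread:)

# Pull the single best word sequence out of each lattice; the shallower
# lattices the warning refers to still contain this path.
# Adjust --acoustic-scale to whatever is appropriate for the model.
lattice-best-path --acoustic-scale=1.0 \
    --word-symbol-table=graph/words.txt \
    "ark:gunzip -c decode_dir/lat.1.gz |" \
    ark,t:decode_dir/one_best_transcripts.txt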



Jaskaran Singh Puri

Jul 24, 2019, 9:05:26 AM
to kaldi-help
So this is my configuration.

I'm using NVIDIA Docker and a GPU to decode my files. The machine has around 500 GB of RAM, a single 16 GB Tesla V100 GPU, and 28 cores.

I tried to send a batch of 600 files with a total size of 3.6 GB, around 48 hours of data, i.e. 600 files in parallel.
I get the error below, although it works when I reduce the input to 10 files, i.e. 10 files in parallel.

Even on a 16 GB GPU I should be able to load 3.6 GB of files in parallel, but as the error below shows, the GPU runs out of memory.
If I run it in exclusive mode, will it still fail? Or is there some other workaround?

I have used the same decoding script as given in the NVIDIA examples:

stdbuf -o 0 $KALDI_ROOT/src/$path/$decoder $cuda_flags --frame-subsampling-factor=3 --config="/notebooks/jpuri/aspire_original/s5/exp/tdnn_7b_chain_online/conf/online.conf" --frames-per-chunk=264  --file-limit=-1 --max-mem=10000000 --beam=30 --lattice-beam=6.0 --acoustic-scale=1.0 --determinize-lattice=true --max-active=10000 --word-symbol-table="/notebooks/jpuri/aspire_original/s5/exp/tdnn_7b_chain_online/graph_pp/words.txt" /notebooks/jpuri/aspire_original/s5/exp/tdnn_7b_chain_online/final.mdl /notebooks/jpuri/aspire_original/s5/exp/tdnn_7b_chain_online/graph_pp/HCLG.fst "scp:/notebooks/jpuri/corning_gpu/wav.scp" "ark:|gzip -c > /notebooks/jpuri/aspire_original/s5/exp/tdnn_7b_chain_online/decode_gpu/lat.$decoder.corning_gpu_60.gz" 2>&1 | tee gpu.log
/opt/kaldi/src/cudadecoderbin/batched-wav-nnet3-cuda --cuda-use-tensor-cores=true --iterations=1 --main-q-capacity=40000 --aux-q-capacity=500000 --cuda-memory-proportion=.5 --max-batch-size=200 --cuda-control-threads=2 --batch-drain-size=15 --cuda-worker-threads=54 --frame-subsampling-factor=3 --config=/notebooks/jpuri/aspire_original/s5/exp/tdnn_7b_chain_online/conf/online.conf --frames-per-chunk=264 --file-limit=-1 --max-mem=10000000 --beam=30 --lattice-beam=6.0 --acoustic-scale=1.0 --determinize-lattice=true --max-active=10000 --word-symbol-table=/notebooks/jpuri/aspire_original/s5/exp/tdnn_7b_chain_online/graph_pp/words.txt /notebooks/jpuri/aspire_original/s5/exp/tdnn_7b_chain_online/final.mdl /notebooks/jpuri/aspire_original/s5/exp/tdnn_7b_chain_online/graph_pp/HCLG.fst scp:/notebooks/jpuri/corning_gpu/wav.scp 'ark:|gzip -c > /notebooks/jpuri/aspire_original/s5/exp/tdnn_7b_chain_online/decode_gpu/lat.batched-wav-nnet3-cuda.corning_gpu_60.gz'
WARNING (batched-wav-nnet3-cuda[5.5]:SelectGpuId():cu-device.cc:221) Not in compute-exclusive mode.  Suggestion: use 'nvidia-smi -c 3' to set compute exclusive mode
LOG (batched-wav-nnet3-cuda[5.5]:SelectGpuIdAuto():cu-device.cc:349) Selecting from 1 GPUs
LOG (batched-wav-nnet3-cuda[5.5]:SelectGpuIdAuto():cu-device.cc:364) cudaSetDevice(0): Tesla V100-SXM2-16GB     free:15756M, used:374M, total:16130M, free/total:0.976791
LOG (batched-wav-nnet3-cuda[5.5]:SelectGpuIdAuto():cu-device.cc:411) Trying to select device: 0 (automatically), mem_ratio: 0.976791
LOG (batched-wav-nnet3-cuda[5.5]:SelectGpuIdAuto():cu-device.cc:430) Success selecting device 0 free mem ratio: 0.976791
LOG (batched-wav-nnet3-cuda[5.5]:FinalizeActiveGpu():cu-device.cc:284) The active GPU is [0]: Tesla V100-SXM2-16GB      free:15592M, used:538M, total:16130M, free/total:0.966624 version 7.0
LOG (batched-wav-nnet3-cuda[5.5]:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 1 orphan nodes.
LOG (batched-wav-nnet3-cuda[5.5]:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 2 orphan components.
LOG (batched-wav-nnet3-cuda[5.5]:Collapse():nnet-utils.cc:1378) Added 1 components, removed 2
LOG (batched-wav-nnet3-cuda[5.5]:Initialize():batched-threaded-nnet3-cuda-pipeline.cc:32) BatchedThreadedNnet3CudaPipeline Initialize with 2 control threads, 54 worker threads and batch size 200
LOG (batched-wav-nnet3-cuda[5.5]:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (batched-wav-nnet3-cuda[5.5]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (batched-wav-nnet3-cuda[5.5]:AllocateNewRegion():cu-allocator.cc:506) About to allocate new memory region of 4087349248 bytes; current memory info is: free:7794M, used:8336M, total:16130M, free/total:0.483192
LOG (batched-wav-nnet3-cuda[5.5]:AllocateNewRegion():cu-allocator.cc:506) About to allocate new memory region of 2043674624 bytes; current memory info is: free:3896M, used:12234M, total:16130M, free/total:0.241538
LOG (batched-wav-nnet3-cuda[5.5]:AllocateNewRegion():cu-allocator.cc:506) About to allocate new memory region of 1021313024 bytes; current memory info is: free:1946M, used:14184M, total:16130M, free/total:0.120649
LOG (batched-wav-nnet3-cuda[5.5]:AllocateNewRegion():cu-allocator.cc:506) About to allocate new memory region of 510656512 bytes; current memory info is: free:972M, used:15158M, total:16130M, free/total:0.0602663
LOG (batched-wav-nnet3-cuda[5.5]:AllocateNewRegion():cu-allocator.cc:506) About to allocate new memory region of 390070272 bytes; current memory info is: free:484M, used:15646M, total:16130M, free/total:0.030013
LOG (batched-wav-nnet3-cuda[5.5]:AllocateNewRegion():cu-allocator.cc:506) About to allocate new memory region of 201326592 bytes; current memory info is: free:112M, used:16018M, total:16130M, free/total:0.00695112
LOG (batched-wav-nnet3-cuda[5.5]:PrintMemoryUsage():cu-allocator.cc:368) Memory usage: 15033003008/16228810752 bytes currently allocated/total-held; 1962/25 blocks currently allocated/free; largest free/allocated block sizes are 800000000/389283840; time taken total/cudaMalloc is 0.0389941/0.0269516, synchronized the GPU 60 times out of 773 frees; device memory info: free:112M, used:16018M, total:16130M, free/total:0.00695112maximum allocated: 15270767616current allocated: 15033003008
ERROR (batched-wav-nnet3-cuda[5.5]:AllocateNewRegion():cu-allocator.cc:519) Failed to allocate a memory region of 201326592 bytes.  Possibly this is due to sharing the GPU.  Try switching the GPUs to exclusive mode (nvidia-smi -c 3) and using the option --use-gpu=wait to scripts like steps/nnet3/chain/train.py.  Memory info: free:112M, used:16018M, total:16130M, free/total:0.00695112

[ Stack-Trace: ]
kaldi::MessageLogger::LogMessage() const
kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)
kaldi::CuMemoryAllocator::AllocateNewRegion(unsigned long)
kaldi::CuMemoryAllocator::MallocPitch(unsigned long, unsigned long, unsigned long*)
kaldi::CuMatrix<float>::Resize(int, int, kaldi::MatrixResizeType, kaldi::MatrixStrideType)
kaldi::nnet3::NnetComputer::ExecuteCommand()
kaldi::nnet3::NnetComputer::Run()
kaldi::nnet3::NnetBatchComputer::Compute(bool)
kaldi::cuda_decoder::BatchedThreadedNnet3CudaPipeline::ComputeBatchNnet(kaldi::nnet3::NnetBatchComputer&, int, std::vector<kaldi::cuda_decoder::BatchedThreadedNnet3CudaPipeline::TaskState*, std::allocator<kaldi::cuda_decoder::BatchedThreadedNnet3CudaPipeline::TaskState*> >&)
kaldi::cuda_decoder::BatchedThreadedNnet3CudaPipeline::ExecuteWorker(int)
std::thread::_Impl<std::_Bind_simple<std::_Mem_fn<void (kaldi::cuda_decoder::BatchedThreadedNnet3CudaPipeline::*)(int)> (kaldi::cuda_decoder::BatchedThreadedNnet3CudaPipeline*, int)> >::_M_run()


clone
WARNING (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:436) Printing some background info since error was detected
LOG (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:437) matrix m1(37504, 40), m2(128, 100), m3(36992, 220), m4(36992, 1024), m5(12288, 4096), m6(12288, 3072), m7(12032, 3072), m8(11776, 3072), m9(11520, 3072), m10(11264, 1024), m11(11264, 1024), m12(11264, 8629), m13(11264, 8629)
# The following show how matrices correspond to network-nodes and
# cindex-ids.  Format is: matrix = <node-id>.[value|deriv][ <list-of-cindex-ids> ]
# where a cindex-id is written as (n,t[,x]) but ranges of t values are compressed
# so we write (n, tfirst:tlast).
m1 == value: input[(0,-17:275), (1,-17:275), (2,-17:275), (3,-17:275), (4,-17:275), (5,-17:275), (6,-17:275), (7,-17:2 ... ,-17:275), (122,-17:275), (123,-17:275), (124,-17:275), (125,-17:275), (126,-17:275), (127,-17:275)]
m2 == value: ivector[(0,0), (1,0), (2,0), (3,0), (4,0), (5,0), (6,0), (7,0), (8,0), (9,0), (10,0), (11,0), (12,0), (13,0 ... , (117,0), (118,0), (119,0), (120,0), (121,0), (122,0), (123,0), (124,0), (125,0), (126,0), (127,0)]
m3 == value: Tdnn_0_affine_input[(0,-16), (1,-16), (2,-16), (3,-16), (4,-16), (5,-16), (6,-16), (7,-16), (8,-16), (9,-16), (10,-16), ... , (119,272), (120,272), (121,272), (122,272), (123,272), (124,272), (125,272), (126,272), (127,272)]
m4 == value: Tdnn_0_affine[(0,-16), (1,-16), (2,-16), (3,-16), (4,-16), (5,-16), (6,-16), (7,-16), (8,-16), (9,-16), (10,-16), ... , (119,272), (120,272), (121,272), (122,272), (123,272), (124,272), (125,272), (126,272), (127,272)]
m5 == value: Tdnn_1_affine_input[(0,-15), (1,-15), (2,-15), (3,-15), (4,-15), (5,-15), (6,-15), (7,-15), (8,-15), (9,-15), (10,-15), ... , (119,270), (120,270), (121,270), (122,270), (123,270), (124,270), (125,270), (126,270), (127,270)]
m6 == value: Tdnn_2_affine_input[(0,-12), (1,-12), (2,-12), (3,-12), (4,-12), (5,-12), (6,-12), (7,-12), (8,-12), (9,-12), (10,-12), ... 648), (123,-2147483648), (124,-2147483648), (125,-2147483648), (126,-2147483648), (127,-2147483648)]
m7 == value: Tdnn_3_affine_input[(0,-9), (1,-9), (2,-9), (3,-9), (4,-9), (5,-9), (6,-9), (7,-9), (8,-9), (9,-9), (10,-9), (11,-9), ( ... 648), (123,-2147483648), (124,-2147483648), (125,-2147483648), (126,-2147483648), (127,-2147483648)]
m8 == value: Tdnn_4_affine_input[(0,-6), (1,-6), (2,-6), (3,-6), (4,-6), (5,-6), (6,-6), (7,-6), (8,-6), (9,-6), (10,-6), (11,-6), ( ... 648), (123,-2147483648), (124,-2147483648), (125,-2147483648), (126,-2147483648), (127,-2147483648)]
m9 == value: Tdnn_5_affine_input[(0,0), (1,0), (2,0), (3,0), (4,0), (5,0), (6,0), (7,0), (8,0), (9,0), (10,0), (11,0), (12,0), (13,0 ... 648), (123,-2147483648), (124,-2147483648), (125,-2147483648), (126,-2147483648), (127,-2147483648)]
m10 == value: Tdnn_5_affine[(0,0), (1,0), (2,0), (3,0), (4,0), (5,0), (6,0), (7,0), (8,0), (9,0), (10,0), (11,0), (12,0), (13,0 ... , (119,261), (120,261), (121,261), (122,261), (123,261), (124,261), (125,261), (126,261), (127,261)]
m11 == value: Tdnn_pre_final_chain_affine[(0,0), (1,0), (2,0), (3,0), (4,0), (5,0), (6,0), (7,0), (8,0), (9,0), (10,0), (11,0), (12,0), (13,0 ... , (119,261), (120,261), (121,261), (122,261), (123,261), (124,261), (125,261), (126,261), (127,261)]
m12 == value: Final_affine[(0,0), (1,0), (2,0), (3,0), (4,0), (5,0), (6,0), (7,0), (8,0), (9,0), (10,0), (11,0), (12,0), (13,0 ... , (119,261), (120,261), (121,261), (122,261), (123,261), (124,261), (125,261), (126,261), (127,261)]
m13 == value: output[(0,0), (0,3), (0,6), (0,9), (0,12), (0,15), (0,18), (0,21), (0,24), (0,27), (0,30), (0,33), (0,36), ... , (127,237), (127,240), (127,243), (127,246), (127,249), (127,252), (127,255), (127,258), (127,261)]

LOG (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:439) c0: m1 = user input [for node: 'input']
LOG (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:439) c1: m2 = user input [for node: 'ivector']
LOG (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:439) c2: [no-op-permanent]
LOG (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:439) c3: m3 = undefined(36992,220)
LOG (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:439) c4: m3(0:36991, 0:39).CopyRows(1, m1[0, 293, 586, 879, 1172, 1465, 1758, 2051, 2344, 2637, 2930, 322 ...  33690, 33983, 34276, 34569, 34862, 35155, 35448, 35741, 36034, 36327, 36620, 36913, 37206, 37499])
LOG (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:439) c5: m3(0:36991, 40:79).CopyRows(1, m1[1, 294, 587, 880, 1173, 1466, 1759, 2052, 2345, 2638, 2931, 32 ...  33691, 33984, 34277, 34570, 34863, 35156, 35449, 35742, 36035, 36328, 36621, 36914, 37207, 37500])
LOG (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:439) c6: m3(0:36991, 80:119).CopyRows(1, m1[2, 295, 588, 881, 1174, 1467, 1760, 2053, 2346, 2639, 2932, 3 ...  33692, 33985, 34278, 34571, 34864, 35157, 35450, 35743, 36036, 36329, 36622, 36915, 37208, 37501])
LOG (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:439) c7: m1 = []
LOG (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:439) c8: m3(0:36991, 120:219).CopyRows(1, m2[0:127, 0:127, 0:127, 0:127, 0:127, 0:127, 0:127, 0:127, 0:12 ...  0:127, 0:127, 0:127, 0:127, 0:127, 0:127, 0:127, 0:127, 0:127, 0:127, 0:127, 0:127, 0:127, 0:127])
LOG (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:439) c9: m2 = []
LOG (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:439) c10: m4 = undefined(36992,1024)
LOG (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:439) c11: L0_fixaffine.Tdnn_0_affine.Propagate(NULL, m3, &m4)
LOG (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:439) c12: m3 = []
LOG (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:439) c13: Tdnn_0_relu.Propagate(NULL, m4, &m4)
LOG (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:439) c14: Tdnn_0_renorm.Propagate(NULL, m4, &m4)
ERROR (batched-wav-nnet3-cuda[5.5]:ExecuteCommand():nnet-compute.cc:443) Error running command c15: m5 = undefined(12288,4096)

[ Stack-Trace: ]
kaldi::MessageLogger::LogMessage() const
kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)
kaldi::nnet3::NnetComputer::ExecuteCommand()
kaldi::nnet3::NnetComputer::Run()
kaldi::nnet3::NnetBatchComputer::Compute(bool)
kaldi::cuda_decoder::BatchedThreadedNnet3CudaPipeline::ComputeBatchNnet(kaldi::nnet3::NnetBatchComputer&, int, std::vector<kaldi::cuda_decoder::BatchedThreadedNnet3CudaPipeline::TaskState*, std::allocator<kaldi::cuda_decoder::BatchedThreadedNnet3CudaPipeline::TaskState*> >&)
kaldi::cuda_decoder::BatchedThreadedNnet3CudaPipeline::ExecuteWorker(int)
std::thread::_Impl<std::_Bind_simple<std::_Mem_fn<void (kaldi::cuda_decoder::BatchedThreadedNnet3CudaPipeline::*)(int)> (kaldi::cuda_decoder::BatchedThreadedNnet3CudaPipeline*, int)> >::_M_run()


clone

LOG (batched-wav-nnet3-cuda[5.5]:PrintMinibatchStats():nnet-batch-compute.cc:104) Minibatch stats: seconds-taken,frames-in:frames-out*minibatch-size=num-done(percent-full%)  0.44,293:88*128=15(96%) 0.00,293:88*16=1(68%)
LOG (batched-wav-nnet3-cuda[5.5]:PrintMinibatchStats():nnet-batch-compute.cc:105) Did 1868 tasks in 16 minibatches, taking 0.44438 seconds.
ERROR (batched-wav-nnet3-cuda[5.5]:~NnetBatchComputer():nnet-batch-compute.cc:119) Tasks are pending but object is being destroyed



Daniel Povey

Jul 24, 2019, 2:36:08 PM
to kaldi-help, Justin Luitjens, Hugo Braun
Justin and/or Hugo might be able to advise which of the CUDA decoder parameters are critical w.r.t. controlling GPU memory usage.  Likely the file length is not an issue as they would be stored in CPU memory.
Guys: since you changed the feature extraction to run on GPU (not sure if that has been merged yet), would that have increased GPU memory usage if the audio files are long?

Dan



Justin Luitjens

Jul 24, 2019, 3:01:27 PM
to dpo...@gmail.com, kaldi-help, Hugo Braun

Feature extraction currently has memory usage O(audio length) as does nnet3.  I’d suggest segmenting audio to about 30 seconds at a time.  However, I have noticed that extract-segments in an “ark:” command becomes a bottleneck so do the segmenting as a preprocessing step.
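
(For illustration, a minimal sketch of that preprocessing step, assuming a Kaldi-style segments file has already been prepared for the long recordings; the paths and names here are placeholders, not taken from the thread:)

# Write the ~30 s wave segments out once, so the decoder does not have to
# re-run extract-segments through an inline "ark:... |" command every time.
# $data is assumed to contain wav.scp and a segments file
# (<segment-id> <recording-id> <start> <end>).
data=data/test_long
$KALDI_ROOT/src/featbin/extract-segments scp:$data/wav.scp $data/segments \
    ark,scp:$data/wav_segments.ark,$data/wav_segments.scp
# wav_segments.scp can then be given to the decoder in place of the original wav.scp.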

 

If this doesn’t solve your issue then we can look at other parameters (max batch size and gpu-control-threads being the two primary parameters).



Daniel Povey

Jul 24, 2019, 3:08:33 PM
to Justin Luitjens, kaldi-help, Hugo Braun
If the length of the audio files is an issue, perhaps --gpu-feature-extract=false would help.
But it's probably not a good idea, in general, to decode super-long files; algorithms like
OpenFst's best-path tend to break down for very long lattices.
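
(For what it's worth, applying that suggestion would just mean adding the flag to the batched-wav-nnet3-cuda invocation from the earlier post, assuming the binary in this Kaldi version exposes the option Dan names; everything else stays as before:)

# Same invocation as in the earlier post, with feature extraction moved off the GPU.
batched-wav-nnet3-cuda --gpu-feature-extract=false \
    <all other options and arguments exactly as in the command above>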

Jaskaran Singh Puri

Jul 25, 2019, 12:54:49 PM
to kaldi-help
What are some scripts that can be used to segment large files when training a model is not an option? Scripts like segment_long_utterances require an input model to be given.

This is for decoding new files only.

Daniel Povey

Jul 25, 2019, 2:20:55 PM
to kaldi-help
I suggest just segmenting the files uniformly with a slight overlap and piecing together the CTMs afterward.  The following may give you a starting point.

git grep get_uniform_subsegments.py
egs/aspire/s5/local/generate_uniformly_segmented_data_dir.sh:  utils/data/get_uniform_subsegments.py \
egs/callhome_diarization/v1/diarization/extract_ivectors.sh:  utils/data/get_uniform_subsegments.py \
egs/callhome_diarization/v1/diarization/nnet3/xvector/extract_xvectors.sh:  utils/data/get_uniform_subsegments.py \
egs/material/s5/local/preprocess_test.sh:    utils/data/get_uniform_subsegments.py --max-segment-duration=30 \
egs/material/s5/local/preprocess_test.sh:    utils/data/get_uniform_subsegments.py --max-segment-duration=30 \
egs/wsj/s5/steps/cleanup/segment_long_utterances.sh:  utils/data/get_uniform_subsegments.py \
egs/wsj/s5/steps/cleanup/segment_long_utterances_nnet3.sh:  utils/data/get_uniform_subsegments.py \
egs/wsj/s5/steps/segmentation/prepare_targets_gmm.sh:  utils/data/get_uniform_subsegments.py \
egs/wsj/s5/utils/data/get_uniform_subsegments.py:        e.g.: get_uniform_subsegments.py data/dev/segments > \\



 git grep resolve_ctm
egs/aspire/s5/local/multi_condition/get_ctm.sh:  utils/ctm/resolve_ctm_overlaps.py $data_dir/segments \
egs/babel/s5d/local/lattice_to_ctm.sh:  resolve_overlaps_cmd="utils/ctm/resolve_ctm_overlaps.py $data/segments - -"
egs/hub4_english/s5/local/score_sclite.sh:resolve_ctm_overlaps=false
egs/hub4_english/s5/local/score_sclite.sh:    if $resolve_ctm_overlaps; then
egs/hub4_english/s5/local/score_sclite.sh:      $cmd LMWT=$min_lmwt:$max_lmwt $dir/scoring/log/resolve_ctm_overlaps.LMWT.${wip}.log \
egs/hub4_english/s5/local/score_sclite.sh:        utils/ctm/resolve_ctm_overlaps.py $data/segments \
egs/material/s5/local/postprocess_test.sh:  utils/ctm/resolve_ctm_overlaps.py data/${data}_hires/segments \
egs/wsj/s5/steps/cleanup/internal/resolve_ctm_edits_overlaps.py:smith-waterman alignment. This script is similar to utils/ctm/resolve_ctm_edits.py,
egs/wsj/s5/steps/cleanup/segment_long_utterances.sh:  $cmd $dir/log/resolve_ctm_edits.log \
egs/wsj/s5/steps/cleanup/segment_long_utterances.sh:    steps/cleanup/internal/resolve_ctm_edits_overlaps.py \
egs/wsj/s5/steps/cleanup/segment_long_utterances_nnet3.sh:  $cmd $dir/log/resolve_ctm_edits.log \
egs/wsj/s5/steps/cleanup/segment_long_utterances_nnet3.sh:    steps/cleanup/internal/resolve_ctm_edits_overlaps.py \
egs/wsj/s5/utils/data/subsegment_data_dir.sh:  echo "  See also: resolve_ctm_overlaps.py"
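
(Pieced together from those scripts, an untested sketch of that route; the directory names and option values are illustrative only, and egs/aspire/s5/local/generate_uniformly_segmented_data_dir.sh wraps roughly the first half of it for the ASpIRE setup:)

# 1. Build whole-file segments for the long recordings, then cut them into
#    ~30 s pieces with a small overlap.
data=data/test_long
utils/data/get_utt2dur.sh $data
utils/data/get_segments_for_data.sh $data > $data/segments
utils/data/get_uniform_subsegments.py --max-segment-duration=30 \
    --overlap-duration=5 $data/segments > $data/uniform_sub_segments
utils/data/subsegment_data_dir.sh $data $data/uniform_sub_segments data/test_long_seg

# 2. Decode data/test_long_seg as usual and produce a ctm per sub-segment,
#    then merge the overlapping pieces back into one ctm per recording.
utils/ctm/resolve_ctm_overlaps.py data/test_long_seg/segments \
    decode_dir/ctm decode_dir/ctm_resolved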


Jaskaran Singh Puri

Jul 26, 2019, 4:09:23 AM
to kaldi-help
But batched-wav-nnet3-cuda does not seem to have an option to decode utterance by utterance; I'm not able to pass any spk2utt file. Is there a workaround?

Justin Luitjens

Jul 26, 2019, 9:41:24 AM
to kaldi...@googlegroups.com
spk2utt is not what you are looking for.  The binary already decodes utterance by utterance.  You just need to break the audio up into segments using something like extract-segments.  We don't need spk2utt because we don't use that information in decoding.


Jaskaran Singh Puri

Jul 26, 2019, 9:52:50 AM
to kaldi...@googlegroups.com
OK, but how do I pass this segments file into the decoding script? batched-wav-nnet3-cuda only takes a wav.scp as input in addition to the model files.

This is unlike online2-wav-nnet3-latgen-faster, which takes a spk2utt as well, so I'm not sure how to pass the segments or other files.

I've created those segments and related files as Dan suggested. There's only one NVIDIA decoding example, and it takes in a wav.scp directly.


Justin Luitjens

Jul 26, 2019, 10:46:51 AM
to kaldi...@googlegroups.com
You can either do an inline call to extract-segments or a preprocessing call.
Something like this:
  $KALDI_ROOT/src/featbin/extract-segments scp:$DATASET_PATH/$test_set/$wavscp $DATASET_PATH/$test_set/segments ark:- > segs.$test_set

Message has been deleted

Daniel Povey

Jul 27, 2019, 2:12:10 PM
to kaldi-help
Yes, see https://kaldi-asr.org/doc/io_tut.html to understand the basic principles
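
(Concretely, and only as a sketch of the rspecifier mechanics that tutorial explains, reusing the variable names from Justin's command: the segmented waves take the place of the original wav.scp in the decoder's wave rspecifier.  "[options]" stands for the decoder options shown earlier in the thread, and the lattice output name is a placeholder.)

# Read the archive written by the preprocessing call ...
batched-wav-nnet3-cuda [options] final.mdl HCLG.fst \
    ark:segs.$test_set \
    'ark:|gzip -c > lat.gz'
# ... or run extract-segments inline inside the rspecifier itself:
batched-wav-nnet3-cuda [options] final.mdl HCLG.fst \
    "ark:$KALDI_ROOT/src/featbin/extract-segments scp:$DATASET_PATH/$test_set/$wavscp $DATASET_PATH/$test_set/segments ark:- |" \
    'ark:|gzip -c > lat.gz'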


On Sat, Jul 27, 2019 at 9:43 AM Jaskaran Singh Puri <jaskar...@gmail.com> wrote:
And how exactly is the output from this command to be used in my decoding?
Will the output file replace wav.scp?

