extract phones with timing from lattice file


Sai Reddy

May 5, 2022, 10:22:43 AM
to kaldi-help
I created a lattice file for an audio file using the commands below
(before this I also ran fix_dir and compute_cmvn):

#extract ivectors 
steps/online/nnet2/extract_ivectors.sh --nj 1 --cmd run.pl data/test_clean_hires data/lang exp/nnet3_cleaned/extractor exp/nnet3_cleaned/ivectors_test_clean_hires

#make a graph 
utils/mkgraph.sh --self-loop-scale 1.0 --remove-oov data/lang exp/nnet3_cleaned/tdnn_sp exp/nnet3_cleaned/tdnn_sp/graph_tgsmall

#decode using the graph (this creates the lattice file lat.1.gz)
steps/nnet3/decode.sh --acwt 1.0 --post-decode-acwt 10.0 --nj 1 --online-ivector-dir exp/nnet3_cleaned/ivectors_test_clean_hires exp/nnet3_cleaned/tdnn_sp/graph_tgsmall data/test_clean_hires exp/nnet3_cleaned/tdnn_sp/decode_test_tgsmall

#to get the transcript
../../../src/latbin/lattice-best-path ark:'gunzip -c exp/nnet3_cleaned/tdnn_sp/decode_test_tgsmall/lat.1.gz |' ark,t:| utils/int2sym.pl -f 2- data/lang/words.txt > out.txt

Now I have the transcript, but I want phonemes with start and end times, like a normal CTM file.
I have the lattice file and have checked most of the scripts for converting a lattice to phonemes, but could not find one that fits my use case.

FYI, I checked lattice-to-phone-lattice,
but its output is another lattice file, not a txt or ctm file.

Please let me know if there is a single script, or a combination of scripts, that produces the required output.

Thanks.

sai..

Daniel Povey

May 5, 2022, 10:40:49 AM
to kaldi-help
see steps/get_ctm.sh 


Sai Reddy

May 5, 2022, 1:07:20 PM
to kaldi-help

Thanks for the fast reply, Dan.

I ran it, and in the output dir I see many dirs created with the prefix score (20 dirs); all of them contain a ctm file with the same data (this is for one single audio file).
The ctm file has data at the word level, not at the phoneme level. Is there an option to get it at the phoneme level?

Thanks..
sai..

Sai Reddy

May 6, 2022, 12:00:53 AM
to kaldi-help
Yesterday, after running get_ctm.sh on the lattice, I got word-level data, but I want phoneme-level data:
[Screenshot 2022-05-05 at 10.43.18 PM.png]

Dan, I did try converting it to a phone lattice (lattice-to-phone-lattice) and then running get_ctm.sh on that,
but it is not working: it returns some weird results (image below) and an error at the end.
[Screenshot 2022-05-06 at 9.20.59 AM.png]

Daniel Povey

May 6, 2022, 12:29:03 AM
to kaldi-help
get_prons.sh writes words with phone sequences.

Sai Reddy

May 6, 2022, 12:41:30 AM
to kaldi-help
Thanks for the reply, Dan.

I tried the steps below and got the required data; let me know if there is anything wrong with them.
Next steps:

src/latbin/lattice-1best --acoustic-scale=0.1 ark:lat.1 ark:1best.lats

src/latbin/nbest-to-linear ark:1best.lats ark:1best.ali 'ark,t:|int2sym.pl -f 2- words.txt > text'

src/bin/ali-to-phones --ctm-output exp/nnet3_cleaned/tdnn_sp/final.mdl ark:1best.ali 1best.ctm
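
Each line that ali-to-phones --ctm-output writes has the form "utt channel start duration phone-id", so the end time asked for at the top of the thread is start plus duration. A minimal sketch of that arithmetic, assuming the times are already in seconds; ctm_add_end_time is a hypothetical helper name, not a Kaldi script:

```shell
# Hypothetical helper (not part of Kaldi): rewrite a CTM so that the
# 4th column holds the end time (start + duration) instead of the duration.
ctm_add_end_time() {
  # $1: CTM file with lines "utt channel start duration token"
  awk '{ printf "%s %s %.3f %.3f %s\n", $1, $2, $3, $3 + $4, $5 }' "$1"
}
```

For example: ctm_add_end_time 1best.ctm > 1best.start_end.ctm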

The content of the 1best.ctm file looks like this; now I just need to map the phones to pure phones and that's it.
[Screenshot 2022-05-06 at 9.58.32 AM.png]
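
The last column of 1best.ctm holds integer phone IDs. utils/int2sym.pl -f 5 data/lang/phones.txt 1best.ctm would turn them into symbols; the sketch below does the same lookup in awk and also strips the word-position suffixes (_B, _I, _E, _S). It assumes that "pure phones" means phones without those suffixes and that data/lang was built with position-dependent phones. ctm_to_pure_phones is a hypothetical helper name, not a Kaldi script:

```shell
# Hypothetical helper (not part of Kaldi): map the phone-ID column of a CTM
# to phone symbols and drop the word-position suffix, if there is one.
ctm_to_pure_phones() {
  # $1: phones.txt with lines "symbol id"
  # $2: CTM file with lines "utt channel start duration phone-id"
  awk 'NR == FNR { sym[$2] = $1; next }   # first file: build the id -> symbol table
       {
         p = ($5 in sym) ? sym[$5] : $5    # leave unknown IDs unchanged
         sub(/_[BIES]$/, "", p)            # AH_B -> AH, T_E -> T; SIL stays SIL
         print $1, $2, $3, $4, p
       }' "$1" "$2"
}
```

For example: ctm_to_pure_phones data/lang/phones.txt 1best.ctm > 1best.phones.ctm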

Thank you very much, Dan.

Thanks..
sai..