--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+unsubscribe@googlegroups.com.
To post to this group, send email to kaldi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/766b576c-a31d-40a4-b135-64abf542a9a2%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
WAV is essentially just a container that can hold audio data encoded in many different ways. Kaldi supports only linear PCM coding, and your wav has audio stored in a different encoding. You could use sox to convert it. You can even do it on the fly, using this format of wav.scp:
audio sox input.wav -t wav -r 16000 -b 16 - |
(This is just an example; you will have to figure out the right switches using man, since sox changes them once in a while.)
y.
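To see which encoding a given file actually uses, the format id that Kaldi reports can be read straight from the WAV header. The following is a minimal sketch in Python, not part of Kaldi: the function names and the small lookup table are illustrative assumptions. In the WAV format, id 1 is linear PCM and id 3 is IEEE floating point, which is consistent with the "format id in file is: 3" error.

```python
import struct

# A few common values of the 16-bit format tag in a WAV "fmt " chunk.
WAV_FORMAT_NAMES = {1: "linear PCM", 3: "IEEE float", 6: "A-law", 7: "mu-law"}


def wav_format_id(data: bytes) -> int:
    """Return the format tag from the 'fmt ' chunk of RIFF/WAVE bytes."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    # Walk the chunk list: 4-byte id, 4-byte little-endian size, payload.
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (chunk_size,) = struct.unpack("<I", data[pos + 4:pos + 8])
        if chunk_id == b"fmt ":
            (format_id,) = struct.unpack("<H", data[pos + 8:pos + 10])
            return format_id
        pos += 8 + chunk_size + (chunk_size & 1)  # chunks are word-aligned
    raise ValueError("no 'fmt ' chunk found")


def describe(data: bytes) -> str:
    """Human-readable summary of the format tag (hypothetical helper)."""
    fmt = wav_format_id(data)
    return f"format id {fmt} ({WAV_FORMAT_NAMES.get(fmt, 'other')})"


# Example use on a file from disk (the header sits in the first few KB):
#   with open("input.wav", "rb") as f:
#       print(describe(f.read(4096)))
```

A file that reports anything other than format id 1 would need converting (for example with sox, as above) before Kaldi can read it.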
On Mon, Apr 23, 2018 at 5:32 PM, <mkp...@umich.edu> wrote:
Hey Dan,
I'm receiving this error message in the log files when running make_mfcc.sh. What is PCM data? And what does it mean that the format id in the file is 3?
# compute-mfcc-feats --verbose=2 --config=/z/mkperez/Replication/Scripts/config/mfcc_config scp,p:/z/mkperez/imuphone/kaldi/data/full_bassAlsoFull/test_adults_man/val_mfcc/log/wav_test_adults_man.1.scp ark:- | copy-feats --compress=true ark:- ark,scp:/z/mkperez/imuphone/kaldi/data/full_bassAlsoFull/test_adults_man/val_mfcc/raw_mfcc_test_adults_man.1.ark,/z/mkperez/imuphone/kaldi/data/full_bassAlsoFull/test_adults_man/val_mfcc/raw_mfcc_test_adults_man.1.scp
# Started at Mon Apr 23 17:21:46 EDT 2018
#
compute-mfcc-feats --verbose=2 --config=/z/mkperez/Replication/Scripts/config/mfcc_config scp,p:/z/mkperez/imuphone/kaldi/data/full_bassAlsoFull/test_adults_man/val_mfcc/log/wav_test_adults_man.1.scp ark:-
copy-feats --compress=true ark:- ark,scp:/z/mkperez/imuphone/kaldi/data/full_bassAlsoFull/test_adults_man/val_mfcc/raw_mfcc_test_adults_man.1.ark,/z/mkperez/imuphone/kaldi/data/full_bassAlsoFull/test_adults_man/val_mfcc/raw_mfcc_test_adults_man.1.scp
ERROR (compute-mfcc-feats[5.0.23~1-f7b2f]:Read():wave-reader.cc:170) WaveData: can read only PCM data, format id in file is: 3
Thanks,
MP
--
Thank you so much for all the help. I was able to run decoding just fine; however, I am getting a WER of 100% (all deletions).
I'm using the TIDIGITS dataset, but I am investigating whether ASR would be able to detect any of the phones/words that were said when the dataset is recorded at a sampling rate of 420 Hz (hardware limited). For this reason my new_TIDIGITS dataset is recorded at 420 Hz, but I've re-recorded this 420 Hz audio using a separate device at an 8 kHz sampling rate, because other ASR toolkits don't go as low as 420 Hz. The results above used the 8 kHz data, so I am now trying the original 420 Hz samples instead. However, I am now getting an error:
# compute-mfcc-feats --verbose=2 --config=/z/mkperez/imuphone/kaldi/scripts//mfcc_config scp,p:/z/mkperez/imuphone/kaldi/data/full_bassAlsoFull/test_adults_man/val_mfcc/log/wav_test_adults_man.1.scp ark:- | copy-feats --compress=true ark:- ark,scp:/z/mkperez/imuphone/kaldi/data/full_bassAlsoFull/test_adults_man/val_mfcc/raw_mfcc_test_adults_man.1.ark,/z/mkperez/imuphone/kaldi/data/full_bassAlsoFull/test_adults_man/val_mfcc/raw_mfcc_test_adults_man.1.scp
# Started at Tue Apr 24 12:45:35 EDT 2018
#
compute-mfcc-feats --verbose=2 --config=/z/mkperez/imuphone/kaldi/scripts//mfcc_config scp,p:/z/mkperez/imuphone/kaldi/data/full_bassAlsoFull/test_adults_man/val_mfcc/log/wav_test_adults_man.1.scp ark:-
ASSERTION_FAILED (compute-mfcc-feats[5.0.23~1-f7b2f]:MelBanks():mel-computations.cc:126) : 'first_index != -1 && last_index >= first_index && "You may have set --num-mel-bins too large."'
From a high-level view, would a Kaldi-trained acoustic model be able to decode 420 Hz sampled data using MFCC features?
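For context on the assertion above: at a 420 Hz sampling rate the Nyquist frequency is only 210 Hz, so very few FFT bins are available to spread across the mel filters. The sketch below is a simplified Python approximation of how triangular mel bins are laid out, not Kaldi's actual code. The defaults of 23 mel bins, a 20 Hz low cutoff, and 25 ms frames are assumptions, since the contents of mfcc_config are not shown in the thread. It lists the mel bins that would cover no FFT bin at all, which is the kind of condition the "--num-mel-bins too large" assertion guards against.

```python
import math


def mel(freq_hz):
    # Mel scale in the commonly used 1127 * ln(1 + f / 700) form.
    return 1127.0 * math.log(1.0 + freq_hz / 700.0)


def empty_mel_bins(sample_rate, num_bins=23, low_freq=20.0, frame_ms=25.0):
    """Return indices of triangular mel bins that contain no FFT bin.

    All defaults are assumed values for illustration, not read from any
    Kaldi config.
    """
    frame_len = int(sample_rate * frame_ms / 1000.0)
    # Pad the frame length up to the next power of two for the FFT.
    padded = 1
    while padded < frame_len:
        padded *= 2
    num_fft_bins = padded // 2
    fft_bin_width = sample_rate / padded

    # Filters are spaced evenly on the mel scale up to the Nyquist frequency.
    mel_low, mel_high = mel(low_freq), mel(sample_rate / 2.0)
    delta = (mel_high - mel_low) / (num_bins + 1)
    fft_mels = [mel(fft_bin_width * i) for i in range(num_fft_bins)]

    empty = []
    for b in range(num_bins):
        left, right = mel_low + b * delta, mel_low + (b + 2) * delta
        if not any(left < m < right for m in fft_mels):
            empty.append(b)
    return empty


# Compare the 420 Hz recordings with typical speech sampling rates.
for rate in (420, 8000, 16000):
    print(rate, "Hz: empty mel bins =", empty_mel_bins(rate))
```

Under these assumptions, a 25 ms frame at 420 Hz holds only about 10 samples, so the filterbank has far fewer FFT bins than mel bins. Lowering --num-mel-bins might avoid the assertion, but whether the resulting features carry enough information for recognition is a separate question.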