Re: [kaldi-help] British Dataset for decoding/Received Pronunciation - RP


Shubham .

Apr 19, 2025, 6:10:24 PM
to kaldi...@googlegroups.com
Can you tell us the pipeline you followed and on which platform you deployed these models?

On Sun, Apr 20, 2025 at 12:32 AM Jayenthiran Pukuraj <jaya...@gmail.com> wrote:
Hi Team,

I've been learning and working with Kaldi for the past two months, and I'm happy to report that I've successfully managed to decode my own audio using the LibriSpeech dataset. However, my requirement is specifically for British English (Received Pronunciation - RP), while LibriSpeech is American English.

Could anyone kindly guide me to the best dataset for British English? Even an external dataset is fine, as long as we can train on it in Kaldi.

Please help me with this.



--
Shubham
Research Scholar, Department of CSIS
BITS Pilani Hyderabad Campus


Shubham .

Apr 21, 2025, 8:54:07 PM
to kaldi...@googlegroups.com
I am trying to create the model using Kaldi, but I am facing many issues, including with training. @jaya...@gmail.com, have you tried converting it for an embedded device?

On Sun, Apr 20, 2025 at 5:08 PM Jayenthiran Pukuraj <jaya...@gmail.com> wrote:


For my ASR experimentation, I used the Kaldi toolkit with the mini_librispeech and TED-LIUM datasets. Here's the pipeline I followed for phoneme-level decoding and evaluation:

Pipeline Overview:
  1. Data Preparation:

    • Created necessary Kaldi files like wav.scp, utt2spk, spk2utt, and text.

    • Used custom audio samples (e.g., mini3.wav) mapped to known transcriptions.

  2. Feature Extraction:

    • Extracted MFCC features using steps/make_mfcc.sh.

    • Applied CMVN normalization via steps/compute_cmvn_stats.sh.

  3. Model and Graph Preparation:

    • Used the pre-trained tri3b acoustic model.

    • Built decoding graph using utils/mkgraph.sh with lang_test_tgsmall.

  4. Decoding:

    • Ran word-level decoding using steps/decode.sh.

    • For phoneme-level analysis, I modified the decoding pipeline to generate phones.ali using ali-to-phones and converted alignments to symbolic format.
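
For reference, here is a condensed sketch of the commands behind the steps above. The directory and file names come from my setup and are partly assumptions, so treat this as an outline rather than a definitive recipe:

# 1. Data preparation (contents of data/mydata are illustrative)
#    wav.scp:  utt1 /path/to/mini3.wav
#    utt2spk:  utt1 spk1
#    text:     utt1 THE KNOWN TRANSCRIPTION
utils/utt2spk_to_spk2utt.pl data/mydata/utt2spk > data/mydata/spk2utt
utils/validate_data_dir.sh --no-feats data/mydata

# 2. Feature extraction and CMVN
steps/make_mfcc.sh --nj 1 data/mydata exp/make_mfcc/mydata mfcc
steps/compute_cmvn_stats.sh data/mydata exp/make_mfcc/mydata mfcc

# 3. Decoding graph from the pre-trained tri3b model
utils/mkgraph.sh data/lang_test_tgsmall exp/tri3b exp/tri3b/graph_tgsmall

# 4. Word-level decoding
steps/decode.sh --nj 1 exp/tri3b/graph_tgsmall data/mydata exp/tri3b/decode_mydata

# Phoneme level: take the best path through the lattice, then map the
# alignment to phone symbols (this is one way to produce phones.ali)
lattice-best-path --acoustic-scale=0.1 \
  "ark:gunzip -c exp/tri3b/decode_mydata/lat.1.gz |" ark,t:words.txt ark:best.ali
ali-to-phones exp/tri3b/final.mdl ark:best.ali ark,t:- \
  | utils/int2sym.pl -f 2- data/lang/phones.txt > phones.ali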

🚀 Platform Details:
  • Environment: Ubuntu 20.04 LTS (on a virtual machine)

  • Deployment: Local deployment on a Linux-based system for offline experimentation.

  • Tools: Kaldi, Python for post-processing, Bash scripts for automation.


Shubham .

Apr 22, 2025, 10:46:18 PM
to kaldi...@googlegroups.com
  --online-ivector-dir exp/nnet3/ivectors_train_clean_100_sp_sp_sp
CUDA not compiled. Proceeding with CPU mode (slower training)...
local/nnet3/run_ivector_common.sh: preparing directory for low-resolution speed-perturbed data (for alignment)
fix_data_dir.sh: kept all 428085 utterances.
fix_data_dir.sh: old files are kept in data/train_clean_100_sp_sp_sp/.backup
utils/data/perturb_data_dir_speed_3way.sh: making sure the utt2dur and the reco2dur files are present
... in data/train_clean_100_sp_sp_sp, because obtaining it after speed-perturbing
... would be very slow, and you might need them.
utils/data/get_utt2dur.sh: data/train_clean_100_sp_sp_sp/utt2dur already exists with the expected length.  We won't recompute it.
utils/data/get_reco2dur.sh: data/train_clean_100_sp_sp_sp/reco2dur already exists with the expected length.  We won't recompute it.
utils/data/perturb_data_dir_speed.sh: generated speed-perturbed version of data in data/train_clean_100_sp_sp_sp, in data/train_clean_100_sp_sp_sp_sp_speed0.9
fix_data_dir.sh: kept all 428085 utterances.
fix_data_dir.sh: old files are kept in data/train_clean_100_sp_sp_sp_sp_speed0.9/.backup
utils/validate_data_dir.sh: Successfully validated data-directory data/train_clean_100_sp_sp_sp_sp_speed0.9
utils/data/perturb_data_dir_speed.sh: generated speed-perturbed version of data in data/train_clean_100_sp_sp_sp, in data/train_clean_100_sp_sp_sp_sp_speed1.1
fix_data_dir.sh: kept all 428085 utterances.
fix_data_dir.sh: old files are kept in data/train_clean_100_sp_sp_sp_sp_speed1.1/.backup
utils/validate_data_dir.sh: Successfully validated data-directory data/train_clean_100_sp_sp_sp_sp_speed1.1
utils/data/combine_data.sh data/train_clean_100_sp_sp_sp_sp data/train_clean_100_sp_sp_sp data/train_clean_100_sp_sp_sp_sp_speed0.9 data/train_clean_100_sp_sp_sp_sp_speed1.1
utils/data/combine_data.sh: combined utt2uniq
utils/data/combine_data.sh [info]: not combining segments as it does not exist
utils/data/combine_data.sh: combined utt2spk
utils/data/combine_data.sh [info]: not combining utt2lang as it does not exist
utils/data/combine_data.sh: combined utt2dur
utils/data/combine_data.sh [info]: not combining utt2num_frames as it does not exist everywhere
utils/data/combine_data.sh: combined reco2dur
utils/data/combine_data.sh [info]: not combining feats.scp as it does not exist everywhere
utils/data/combine_data.sh: combined text
utils/data/combine_data.sh [info]: not combining cmvn.scp as it does not exist everywhere
utils/data/combine_data.sh [info]: not combining vad.scp as it does not exist
utils/data/combine_data.sh [info]: not combining reco2file_and_channel as it does not exist
utils/data/combine_data.sh: combined wav.scp
utils/data/combine_data.sh [info]: not combining spk2gender as it does not exist
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp/utt2spk is not in sorted order or not unique, sorting it
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp/text is not in sorted order or not unique, sorting it
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp/wav.scp is not in sorted order or not unique, sorting it
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp/utt2uniq is not in sorted order or not unique, sorting it
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp/utt2dur is not in sorted order or not unique, sorting it
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp/reco2dur is not in sorted order or not unique, sorting it
fix_data_dir.sh: kept all 884709 utterances.
fix_data_dir.sh: old files are kept in data/train_clean_100_sp_sp_sp_sp/.backup
utils/data/perturb_data_dir_speed_3way.sh: generated 3-way speed-perturbed version of data in data/train_clean_100_sp_sp_sp, in data/train_clean_100_sp_sp_sp_sp
utils/validate_data_dir.sh: Successfully validated data-directory data/train_clean_100_sp_sp_sp_sp
local/nnet3/run_ivector_common.sh: making MFCC features for low-resolution speed-perturbed data
steps/make_mfcc.sh --cmd run.pl --nj 50 data/train_clean_100_sp_sp_sp_sp
utils/validate_data_dir.sh: Successfully validated data-directory data/train_clean_100_sp_sp_sp_sp
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
steps/make_mfcc.sh: Succeeded creating MFCC features for train_clean_100_sp_sp_sp_sp
steps/compute_cmvn_stats.sh data/train_clean_100_sp_sp_sp_sp
Succeeded creating CMVN stats for train_clean_100_sp_sp_sp_sp
local/nnet3/run_ivector_common.sh: fixing input data-dir to remove nonexistent features, in case some
.. speed-perturbed segments were too short.
fix_data_dir.sh: kept all 884709 utterances.
fix_data_dir.sh: old files are kept in data/train_clean_100_sp_sp_sp_sp/.backup
local/nnet3/run_ivector_common.sh: aligning with the perturbed low-resolution data
steps/align_fmllr.sh --nj 100 --cmd run.pl data/train_clean_100_sp_sp_sp_sp data/lang exp/tri3 exp/tri3_ali_train_clean_100_sp_sp_sp_sp
steps/align_fmllr.sh: feature type is lda
steps/align_fmllr.sh: compiling training graphs
steps/align_fmllr.sh: aligning data in data/train_clean_100_sp_sp_sp_sp using exp/tri3/final.mdl and speaker-independent features.

steps/align_fmllr.sh: computing fMLLR transforms
steps/align_fmllr.sh: doing final alignment.
steps/align_fmllr.sh: done aligning data.
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/tri3_ali_train_clean_100_sp_sp_sp_sp
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri3_ali_train_clean_100_sp_sp_sp_sp/log/analyze_alignments.log
22386 warnings in exp/tri3_ali_train_clean_100_sp_sp_sp_sp/log/align_pass2.*.log
2314 warnings in exp/tri3_ali_train_clean_100_sp_sp_sp_sp/log/fmllr.*.log
66419 warnings in exp/tri3_ali_train_clean_100_sp_sp_sp_sp/log/align_pass1.*.log
local/nnet3/run_ivector_common.sh: creating high-resolution MFCC features
utils/copy_data_dir.sh: copied data from data/train_clean_100_sp_sp_sp_sp to data/train_clean_100_sp_sp_sp_sp_hires
utils/validate_data_dir.sh: Successfully validated data-directory data/train_clean_100_sp_sp_sp_sp_hires
utils/copy_data_dir.sh: copied data from data/test_clean to data/test_clean_hires
utils/validate_data_dir.sh: Successfully validated data-directory data/test_clean_hires
copy_data_dir.sh: no such file data/test_other/utt2spk
(kaldi_py38) shubham@csis-ML10Gen9:~/kaldi/egs/librispeech/s5$
(kaldi_py38) shubham@csis-ML10Gen9:~/kaldi/egs/librispeech/s5$ local/nnet3/run_tdnn.sh \
  --stage 0 \
  --train-set train_clean_100_sp_sp_sp_sp \
  --gmm tri3 \
  --nnet3-affix "" \
  --online-ivector-dir exp/nnet3/ivectors_train_clean_100_sp_sp_sp_sp
CUDA not compiled. Proceeding with CPU mode (slower training)...
local/nnet3/run_ivector_common.sh: preparing directory for low-resolution speed-perturbed data (for alignment)
fix_data_dir.sh: kept all 884709 utterances.
fix_data_dir.sh: old files are kept in data/train_clean_100_sp_sp_sp_sp/.backup
utils/data/perturb_data_dir_speed_3way.sh: making sure the utt2dur and the reco2dur files are present
... in data/train_clean_100_sp_sp_sp_sp, because obtaining it after speed-perturbing
... would be very slow, and you might need them.
utils/data/get_utt2dur.sh: data/train_clean_100_sp_sp_sp_sp/utt2dur already exists with the expected length.  We won't recompute it.
utils/data/get_reco2dur.sh: data/train_clean_100_sp_sp_sp_sp/reco2dur already exists with the expected length.  We won't recompute it.
utils/data/perturb_data_dir_speed.sh: generated speed-perturbed version of data in data/train_clean_100_sp_sp_sp_sp, in data/train_clean_100_sp_sp_sp_sp_sp_speed0.9
fix_data_dir.sh: kept all 884709 utterances.
fix_data_dir.sh: old files are kept in data/train_clean_100_sp_sp_sp_sp_sp_speed0.9/.backup
utils/validate_data_dir.sh: Successfully validated data-directory data/train_clean_100_sp_sp_sp_sp_sp_speed0.9
utils/data/perturb_data_dir_speed.sh: generated speed-perturbed version of data in data/train_clean_100_sp_sp_sp_sp, in data/train_clean_100_sp_sp_sp_sp_sp_speed1.1
fix_data_dir.sh: kept all 884709 utterances.
fix_data_dir.sh: old files are kept in data/train_clean_100_sp_sp_sp_sp_sp_speed1.1/.backup
utils/validate_data_dir.sh: Successfully validated data-directory data/train_clean_100_sp_sp_sp_sp_sp_speed1.1
utils/data/combine_data.sh data/train_clean_100_sp_sp_sp_sp_sp data/train_clean_100_sp_sp_sp_sp data/train_clean_100_sp_sp_sp_sp_sp_speed0.9 data/train_clean_100_sp_sp_sp_sp_sp_speed1.1
utils/data/combine_data.sh: combined utt2uniq
utils/data/combine_data.sh [info]: not combining segments as it does not exist
utils/data/combine_data.sh: combined utt2spk
utils/data/combine_data.sh [info]: not combining utt2lang as it does not exist
utils/data/combine_data.sh: combined utt2dur
utils/data/combine_data.sh [info]: not combining utt2num_frames as it does not exist everywhere
utils/data/combine_data.sh: combined reco2dur
utils/data/combine_data.sh [info]: not combining feats.scp as it does not exist everywhere
utils/data/combine_data.sh: combined text
utils/data/combine_data.sh [info]: not combining cmvn.scp as it does not exist everywhere
utils/data/combine_data.sh [info]: not combining vad.scp as it does not exist
utils/data/combine_data.sh [info]: not combining reco2file_and_channel as it does not exist
utils/data/combine_data.sh: combined wav.scp
utils/data/combine_data.sh [info]: not combining spk2gender as it does not exist
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp_sp/utt2spk is not in sorted order or not unique, sorting it
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp_sp/text is not in sorted order or not unique, sorting it
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp_sp/wav.scp is not in sorted order or not unique, sorting it
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp_sp/utt2uniq is not in sorted order or not unique, sorting it
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp_sp/utt2dur is not in sorted order or not unique, sorting it
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp_sp/reco2dur is not in sorted order or not unique, sorting it
fix_data_dir.sh: kept all 1797957 utterances.
fix_data_dir.sh: old files are kept in data/train_clean_100_sp_sp_sp_sp_sp/.backup
utils/data/perturb_data_dir_speed_3way.sh: generated 3-way speed-perturbed version of data in data/train_clean_100_sp_sp_sp_sp, in data/train_clean_100_sp_sp_sp_sp_sp
utils/validate_data_dir.sh: Successfully validated data-directory data/train_clean_100_sp_sp_sp_sp_sp
local/nnet3/run_ivector_common.sh: making MFCC features for low-resolution speed-perturbed data
steps/make_mfcc.sh --cmd run.pl --nj 50 data/train_clean_100_sp_sp_sp_sp_sp
utils/validate_data_dir.sh: Successfully validated data-directory data/train_clean_100_sp_sp_sp_sp_sp
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
steps/make_mfcc.sh: Succeeded creating MFCC features for train_clean_100_sp_sp_sp_sp_sp
steps/compute_cmvn_stats.sh data/train_clean_100_sp_sp_sp_sp_sp
Succeeded creating CMVN stats for train_clean_100_sp_sp_sp_sp_sp
local/nnet3/run_ivector_common.sh: fixing input data-dir to remove nonexistent features, in case some
.. speed-perturbed segments were too short.
fix_data_dir.sh: kept all 1797957 utterances.
fix_data_dir.sh: old files are kept in data/train_clean_100_sp_sp_sp_sp_sp/.backup
local/nnet3/run_ivector_common.sh: aligning with the perturbed low-resolution data
steps/align_fmllr.sh --nj 100 --cmd run.pl data/train_clean_100_sp_sp_sp_sp_sp data/lang exp/tri3 exp/tri3_ali_train_clean_100_sp_sp_sp_sp_sp
steps/align_fmllr.sh: feature type is lda
steps/align_fmllr.sh: compiling training graphs
run.pl: 40 / 100 failed, log is in exp/tri3_ali_train_clean_100_sp_sp_sp_sp_sp/log/compile_graphs.*.log
(kaldi_py38) shubham@csis-ML10Gen9:~/kaldi/egs/librispeech/s5$ rm -rf data/train_clean_100_sp_sp_sp_sp_sp*
rm -rf exp/tri3_ali_train_clean_100_sp_sp_sp_sp_sp
(kaldi_py38) shubham@csis-ML10Gen9:~/kaldi/egs/librispeech/s5$ [ -f $train_data_dir/feats.scp ]
[ -f $train_ivector_dir/ivector_online.scp ]
[ -f $ali_dir/ali.1.gz ]
(kaldi_py38) shubham@csis-ML10Gen9:~/kaldi/egs/librispeech/s5$ du -sh data/ exp/
130G data/
68G exp/
(kaldi_py38) shubham@csis-ML10Gen9:~/kaldi/egs/librispeech/s5$ local/nnet3/run_tdnn.sh \
  --stage 0 \
  --train-set train_clean_100_sp_sp_sp_sp \
  --gmm tri3 \
  --nnet3-affix "" \
  --online-ivector-dir exp/nnet3/ivectors_train_clean_100_sp_sp_sp_sp
CUDA not compiled. Proceeding with CPU mode (slower training)...
local/nnet3/run_ivector_common.sh: preparing directory for low-resolution speed-perturbed data (for alignment)
fix_data_dir.sh: kept all 884709 utterances.
fix_data_dir.sh: old files are kept in data/train_clean_100_sp_sp_sp_sp/.backup
utils/data/perturb_data_dir_speed_3way.sh: making sure the utt2dur and the reco2dur files are present
... in data/train_clean_100_sp_sp_sp_sp, because obtaining it after speed-perturbing
... would be very slow, and you might need them.
utils/data/get_utt2dur.sh: data/train_clean_100_sp_sp_sp_sp/utt2dur already exists with the expected length.  We won't recompute it.
utils/data/get_reco2dur.sh: data/train_clean_100_sp_sp_sp_sp/reco2dur already exists with the expected length.  We won't recompute it.
utils/data/perturb_data_dir_speed.sh: generated speed-perturbed version of data in data/train_clean_100_sp_sp_sp_sp, in data/train_clean_100_sp_sp_sp_sp_sp_speed0.9
fix_data_dir.sh: kept all 884709 utterances.
fix_data_dir.sh: old files are kept in data/train_clean_100_sp_sp_sp_sp_sp_speed0.9/.backup
utils/validate_data_dir.sh: Successfully validated data-directory data/train_clean_100_sp_sp_sp_sp_sp_speed0.9
utils/data/perturb_data_dir_speed.sh: generated speed-perturbed version of data in data/train_clean_100_sp_sp_sp_sp, in data/train_clean_100_sp_sp_sp_sp_sp_speed1.1
fix_data_dir.sh: kept all 884709 utterances.
fix_data_dir.sh: old files are kept in data/train_clean_100_sp_sp_sp_sp_sp_speed1.1/.backup
utils/validate_data_dir.sh: Successfully validated data-directory data/train_clean_100_sp_sp_sp_sp_sp_speed1.1
utils/data/combine_data.sh data/train_clean_100_sp_sp_sp_sp_sp data/train_clean_100_sp_sp_sp_sp data/train_clean_100_sp_sp_sp_sp_sp_speed0.9 data/train_clean_100_sp_sp_sp_sp_sp_speed1.1
utils/data/combine_data.sh: combined utt2uniq
utils/data/combine_data.sh [info]: not combining segments as it does not exist
utils/data/combine_data.sh: combined utt2spk
utils/data/combine_data.sh [info]: not combining utt2lang as it does not exist
utils/data/combine_data.sh: combined utt2dur
utils/data/combine_data.sh [info]: not combining utt2num_frames as it does not exist everywhere
utils/data/combine_data.sh: combined reco2dur
utils/data/combine_data.sh [info]: not combining feats.scp as it does not exist everywhere
utils/data/combine_data.sh: combined text
utils/data/combine_data.sh [info]: not combining cmvn.scp as it does not exist everywhere
utils/data/combine_data.sh [info]: not combining vad.scp as it does not exist
utils/data/combine_data.sh [info]: not combining reco2file_and_channel as it does not exist
utils/data/combine_data.sh: combined wav.scp
utils/data/combine_data.sh [info]: not combining spk2gender as it does not exist
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp_sp/utt2spk is not in sorted order or not unique, sorting it
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp_sp/text is not in sorted order or not unique, sorting it
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp_sp/wav.scp is not in sorted order or not unique, sorting it
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp_sp/utt2uniq is not in sorted order or not unique, sorting it
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp_sp/utt2dur is not in sorted order or not unique, sorting it
utils/fix_data_dir.sh: file data/train_clean_100_sp_sp_sp_sp_sp/reco2dur is not in sorted order or not unique, sorting it
fix_data_dir.sh: kept all 1797957 utterances.
fix_data_dir.sh: old files are kept in data/train_clean_100_sp_sp_sp_sp_sp/.backup
utils/data/perturb_data_dir_speed_3way.sh: generated 3-way speed-perturbed version of data in data/train_clean_100_sp_sp_sp_sp, in data/train_clean_100_sp_sp_sp_sp_sp
utils/validate_data_dir.sh: Successfully validated data-directory data/train_clean_100_sp_sp_sp_sp_sp
local/nnet3/run_ivector_common.sh: making MFCC features for low-resolution speed-perturbed data
steps/make_mfcc.sh --cmd run.pl --nj 50 data/train_clean_100_sp_sp_sp_sp_sp
utils/validate_data_dir.sh: Successfully validated data-directory data/train_clean_100_sp_sp_sp_sp_sp
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
I am trying to train a TDNN, but it keeps failing again and again, and I don't know why.
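
For context, this is roughly how I am invoking the recipe. My guess (only a guess, from the _sp suffixes piling up in the logs above) is that the script applies speed perturbation itself, so passing an already-perturbed train set makes it perturb the data again on every run:

# what the log shows me running:
local/nnet3/run_tdnn.sh --stage 0 --train-set train_clean_100_sp_sp_sp_sp \
  --gmm tri3 --nnet3-affix "" \
  --online-ivector-dir exp/nnet3/ivectors_train_clean_100_sp_sp_sp_sp

# what I plan to try instead (assumption: pass the base, unperturbed set and
# point --online-ivector-dir at whatever run_ivector_common.sh regenerates):
local/nnet3/run_tdnn.sh --stage 0 --train-set train_clean_100 --gmm tri3 --nnet3-affix ""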

On Tue, Apr 22, 2025 at 5:30 PM Jayenthiran Pukuraj <jaya...@gmail.com> wrote:
Can you explain the query in a way I can understand?
My mail: jayav...@gmail.com


Shubham .

Apr 23, 2025, 12:07:27 AM
to kaldi...@googlegroups.com
I am taking help from GPT. Also, why are you using Kaldi? You could use any prebuilt model.

On Wed, Apr 23, 2025 at 9:36 AM Jayenthiran Pukuraj <jaya...@gmail.com> wrote:
Hi Shubham,

I'm also a beginner with Kaldi.
I'm trying a different scenario: British English phoneme-level decoding.
When the user speaks a word, I need phoneme-level decoding, and I need to find where he mispronounced particular phoneme sounds.

Can you guide me on this, or give a small lecture on how to achieve it?
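
To make it concrete, this is the kind of comparison I have in mind. A minimal sketch; the file names and phone sequences here are made up for illustration:

# ref_phones.txt : expected phones per utterance (from the lexicon), e.g.
#                  utt1 DH AH K AE T
# hyp_phones.txt : phones actually decoded from the learner's audio, e.g.
#                  utt1 DH AH K AH T
# align-text lines the two sequences up so substituted or missing phones stand out
align-text ark:ref_phones.txt ark:hyp_phones.txt ark,t:- \
  | utils/scoring/wer_per_utt_details.pl > per_phone_diff.txt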



Shubham .

Apr 23, 2025, 12:10:37 AM
to kaldi...@googlegroups.com
No, you can train it on your own lexicon.
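
Roughly, with the standard Kaldi steps (a sketch; the dict-directory contents are yours to supply):

# data/local/dict needs lexicon.txt mapping each word to your phones, e.g.:
#   HELLO  HH AH L OW
#   WORLD  W ER L D
# plus nonsilence_phones.txt, silence_phones.txt and optional_silence.txt
utils/prepare_lang.sh data/local/dict "<UNK>" data/local/lang_tmp data/lang
# then retrain the acoustic-model stages on your own audio, starting from e.g.:
steps/train_mono.sh --nj 4 data/train data/lang exp/mono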

On Wed, Apr 23, 2025 at 9:39 AM Jayenthiran Pukuraj <jaya...@gmail.com> wrote:
Because my English expert wants phonemes for words particular to Indian dialects.
So if we take any working model, we cannot train it on our own lexicon.txt word phonemes and audio, right?

Moreover, I'm also working with the help of ChatGPT only.



Shubham .

Apr 23, 2025, 12:12:58 AM
to kaldi...@googlegroups.com
I don't know which models; you can search, or just ask GPT instead of using Kaldi.

On Wed, Apr 23, 2025 at 9:42 AM Jayenthiran Pukuraj <jaya...@gmail.com> wrote:
Wow, but I didn't know this. I thought Kaldi was the only way.
I've been so tired for the past month.
Which models?


Shubham .

Apr 23, 2025, 12:57:10 AM
to kaldi...@googlegroups.com
ok

On Wed, Apr 23, 2025 at 9:48 AM Jayenthiran Pukuraj <jaya...@gmail.com> wrote:
Even GPT says that the best option is Kaldi + phoneme decoding;
only then did I dive into this big ocean.
Now I'm struggling to even understand the logic behind Kaldi workflows.


On Wed, Apr 23, 2025 at 9:46 AM Jayenthiran Pukuraj <jaya...@gmail.com> wrote:
Yes, I searched with GPT, but my requirement is taking student audio files as input and converting them into phoneme decodings.
Also, I need phoneme forced alignment to help the students.
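
For the forced-alignment part, the flow I am aiming at looks roughly like this (a sketch; model and directory names are placeholders from my experiments):

# force-align the student's audio against the known prompt text
steps/align_fmllr.sh --nj 1 data/student data/lang exp/tri3b exp/tri3b_ali_student
# dump time-stamped phones in CTM format: utt channel start duration phone-id
ali-to-phones --ctm-output exp/tri3b/final.mdl \
  "ark:gunzip -c exp/tri3b_ali_student/ali.1.gz |" - > student_phones.ctm
# map the phone ids (field 5) to symbols
utils/int2sym.pl -f 5 data/lang/phones.txt student_phones.ctm > student_phones_sym.ctm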

