run_tdnn_1n.sh: copy-int-vector "Copied 0 vectors of int32" problem


chlwjd...@gmail.com

Jul 24, 2020, 12:21:58 AM
to zeroth-help
Hello, I am working through a speech recognition exercise with kaldi-zeroth.
First of all, thank you; the project has been a great help.

run_openslr.sh completed normally through the GMM-HMM stage, but a problem occurs in run_tdnn_1n.sh, the online chain training script.

analyze_phone_length_stats.py: WARNING: optional-silence SIL is seen only 60.63712704850066% of the time at utterance end.  This may not be optimal.
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri4b_ali_train_clean_sp/log/analyze_alignments.log
423 warnings in exp/tri4b_ali_train_clean_sp/log/fmllr.*.log
1 warnings in exp/tri4b_ali_train_clean_sp/log/analyze_alignments.log
1665 warnings in exp/tri4b_ali_train_clean_sp/log/align_pass2.*.log
1584 warnings in exp/tri4b_ali_train_clean_sp/log/align_pass1.*.log
local/nnet3/multi_condition/run_ivector_common.sh: creating reverberated MFCC features
utils/combine_data.sh data/train_clean_sp_rvb1 data/train_clean_sp/split20/1_rvb1 data/train_clean_sp/split20/2_rvb1 data/train_clean_sp/split20/3_rvb1 data/train_clean_sp/split20/4_rvb1 data/train_clean_sp/split20/5_rvb1 data/train_clean_sp/split20/6_rvb1 data/train_clean_sp/split20/7_rvb1 data/train_clean_sp/split20/8_rvb1 data/train_clean_sp/split20/9_rvb1 data/train_clean_sp/split20/10_rvb1 data/train_clean_sp/split20/11_rvb1 data/train_clean_sp/split20/12_rvb1 data/train_clean_sp/split20/13_rvb1 data/train_clean_sp/split20/14_rvb1 data/train_clean_sp/split20/15_rvb1 data/train_clean_sp/split20/16_rvb1 data/train_clean_sp/split20/17_rvb1 data/train_clean_sp/split20/18_rvb1 data/train_clean_sp/split20/19_rvb1 data/train_clean_sp/split20/20_rvb1
utils/combine_data.sh: combined utt2uniq
utils/combine_data.sh [info]: not combining segments as it does not exist
utils/combine_data.sh: combined utt2spk
utils/combine_data.sh [info]: not combining utt2lang as it does not exist
utils/combine_data.sh [info]: not combining utt2dur as it does not exist
utils/combine_data.sh [info]: not combining utt2num_frames as it does not exist
utils/combine_data.sh [info]: not combining reco2dur as it does not exist
utils/combine_data.sh [info]: not combining feats.scp as it does not exist
utils/combine_data.sh: combined text
utils/combine_data.sh [info]: not combining cmvn.scp as it does not exist
utils/combine_data.sh [info]: not combining vad.scp as it does not exist
utils/combine_data.sh [info]: not combining reco2file_and_channel as it does not exist
utils/combine_data.sh: combined wav.scp
utils/combine_data.sh [info]: not combining spk2gender as it does not exist
fix_data_dir.sh: kept all 133578 utterances.
fix_data_dir.sh: old files are kept in data/train_clean_sp_rvb1/.backup
utils/copy_data_dir.sh: copied data from data/train_clean_sp_rvb1 to data/train_clean_sp_rvb1_hires
utils/validate_data_dir.sh: Successfully validated data-directory data/train_clean_sp_rvb1_hires
utils/data/perturb_data_dir_volume.sh: added volume perturbation to the data in data/train_clean_sp_rvb1_hires
steps/make_mfcc.sh --nj 20 --mfcc-config conf/mfcc_hires.conf --cmd run.pl --mem 2G data/train_clean_sp_rvb1_hires exp/make_hires/train_clean_sp_rvb1 mfcc_rvb
utils/validate_data_dir.sh: Successfully validated data-directory data/train_clean_sp_rvb1_hires
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
steps/make_mfcc.sh: Succeeded creating MFCC features for train_clean_sp_rvb1_hires
steps/compute_cmvn_stats.sh data/train_clean_sp_rvb1_hires exp/make_hires/train_clean_sp_rvb1 mfcc_rvb
Succeeded creating CMVN stats for train_clean_sp_rvb1_hires
local/multi_condition/copy_ali_dir.sh: copied alignments from exp/tri4b_ali_train_clean_sp to exp/tri4b_ali_train_clean_sp_temp_0
local/multi_condition/copy_ali_dir.sh: copied alignments from exp/tri4b_ali_train_clean_sp to exp/tri4b_ali_train_clean_sp_temp_1
steps/combine_ali_dirs.sh data/train_clean_sp_rvb1 exp/tri4b_ali_train_clean_sp_rvb exp/tri4b_ali_train_clean_sp_temp_0 exp/tri4b_ali_train_clean_sp_temp_1
steps/combine_ali_dirs.sh: warning: Alignment lattices (lat.*.gz) are not present in exp/tri4b_ali_train_clean_sp_temp_0, not combining. Consider '--combine_lat false' to suppress this warning.
steps/combine_ali_dirs.sh: note: Temporary directory exp/tri4b_ali_train_clean_sp_rvb/temp.BNufvA will not be deleted in case of script failure, so you could examine it for troubleshooting.
steps/combine_ali_dirs.sh: Gathering alignments from each source directory.
run.pl: 2 / 2 failed, log is in exp/tri4b_ali_train_clean_sp_rvb/log/gather_alignments.*.log

# copy-int-vector "ark:gunzip -c $(cat exp/tri4b_ali_train_clean_sp_rvb/temp.BNufvA/src_arks.1) |" ark,scp:exp/tri4b_ali_train_clean_sp_rvb/temp.BNufvA/ali.1.ark,exp/tri4b_ali_train_clean_sp_rvb/temp.BNufvA/ali.1.scp 
# Started at Fri Jul 24 03:51:29 UTC 2020
#
copy-int-vector 'ark:gunzip -c exp/tri4b_ali_train_clean_sp_temp_0/ali.1.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.2.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.3.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.4.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.5.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.6.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.7.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.8.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.9.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.10.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.11.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.12.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.13.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.14.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.15.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.16.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.17.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.18.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.19.gz exp/tri4b_ali_train_clean_sp_temp_0/ali.20.gz  |' ark,scp:exp/tri4b_ali_train_clean_sp_rvb/temp.BNufvA/ali.1.ark,exp/tri4b_ali_train_clean_sp_rvb/temp.BNufvA/ali.1.scp
LOG (copy-int-vector[5.5]:main():copy-int-vector.cc:83) Copied 0 vectors of int32.
# Accounting: time=0 threads=1
# Ended (code 1) at Fri Jul 24 03:51:29 UTC 2020, elapsed time 0 seconds
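
One way to narrow this down, as a rough sketch rather than a confirmed fix: the log above shows copy-int-vector reading the concatenated output of gunzip over ali.1.gz through ali.20.gz and copying 0 vectors, so it may be worth checking whether those archives exist, pass a gzip integrity test, and decompress to something non-empty. The helper below, `check_ali_dir`, is a hypothetical name and is not part of Kaldi or zeroth; it uses only standard gzip/gunzip/wc and assumes a bash shell.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kaldi or zeroth): report alignment archives
# that are missing, fail the gzip integrity check, or decompress to nothing.
check_ali_dir() {
  local dir=$1 bad=0 f
  for f in "$dir"/ali.*.gz; do
    # If the glob matched nothing, $f is the literal pattern and does not exist.
    if [ ! -e "$f" ]; then
      echo "MISSING: no ali.*.gz files in $dir"
      return 1
    fi
    if ! gzip -t "$f" 2>/dev/null; then
      echo "CORRUPT: $f"
      bad=1
    elif [ "$(gunzip -c "$f" | wc -c | tr -d ' ')" = "0" ]; then
      echo "EMPTY: $f"
      bad=1
    fi
  done
  [ "$bad" -eq 0 ] && echo "OK: $dir"
  return "$bad"
}
```

For example, `check_ali_dir exp/tri4b_ali_train_clean_sp_temp_0` inspects the directory named in the failing command, and running it on `exp/tri4b_ali_train_clean_sp` as well would show whether the source alignments were already affected before they were copied.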

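I found that others have reported similar errors, which were diagnosed as a naming convention issue. However, my Kaldi version is commit 2c7e78f from May, and zeroth is the latest version.
The links I referred to are as follows.
Looking at the code, the parts that were pointed out already appear to have been fixed, so I would appreciate it if you could take a look.
I changed num_jobs_initial and num_jobs_final to 1, and changed nCPU to 6 (the machine has 24 cores).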

I used the nvcr.io/nvidia/kaldi:20.06-py3 image, obtained with docker pull, and the machine has 64 GB of RAM. The latest Kaldi version does not support CUDA 11.0, so I used NVIDIA's source.
Thank you.




Lucas Jo

Oct 21, 2020, 7:50:26 PM
to zeroth-help
Hmm... it is hard to comment on this without running it myself.
Is this still happening?

If it is still unresolved, I will do a code review.

On Thursday, July 23, 2020 at 10:21:58 PM UTC-6, chlwjd...@gmail.com wrote:

dboo

Sep 10, 2021, 12:31:16 AM
to zeroth-help
I ran into the same problem. How did you resolve it, by any chance?

On Thursday, October 22, 2020 at 8:50:26 AM UTC+9, Lucas Jo wrote: