Question about calling the rescore step of decode.sh from gstreamer

Hongseob Lee

Oct 20, 2021, 10:03:58 PM

to zeroth-help

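Hello,

I trained zeroth on the default open_slr data, and decode.sh ran successfully.

(Using the kaldi and zeroth master git tags.)

I also hooked the model up to gstreamer, and calling it returns results normally.

However, when I compare the decode.sh output with the gstreamer output, the gstreamer output looks as if the rescore step of decode.sh was not applied.

decode.sh speech recognition result: file(speechDATA/test_data_01/003/121/121_003_0017.flac)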

000 한 특허법인 은 전국 을 돌며 [55] 건 을 수임 [14] 억여원 을 벌어들인 것으로 나타났다

gstreamer result:

한 특허법인 은 전국 을 돌며 [50] [5] 건 을 수임 십 [4] 억여원 을 벌어들인 것으로 나타났다
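To make the mismatch easier to see, here is a minimal sketch that compares the two hypotheses token by token with Python's standard `difflib`. The two strings are copied from the results above (with the leading utterance ID `000` dropped from the decode.sh line); the helper name `token_diff` is only an illustration, not part of zeroth or kaldi.

```python
import difflib

# Hypotheses copied from the post (utterance ID "000" dropped from the decode.sh line).
decode_sh = "한 특허법인 은 전국 을 돌며 [55] 건 을 수임 [14] 억여원 을 벌어들인 것으로 나타났다".split()
gstreamer = "한 특허법인 은 전국 을 돌며 [50] [5] 건 을 수임 십 [4] 억여원 을 벌어들인 것으로 나타났다".split()


def token_diff(ref, hyp):
    """Return (ref_tokens, hyp_tokens) pairs for every region where the two differ."""
    matcher = difflib.SequenceMatcher(a=ref, b=hyp, autojunk=False)
    return [
        (ref[i1:i2], hyp[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


differences = token_diff(decode_sh, gstreamer)
for ref_part, hyp_part in differences:
    print(" ".join(ref_part), "->", " ".join(hyp_part))
```

The only differing regions are the two number expressions: `[55]` versus `[50] [5]`, and `[14]` versus `십 [4]`. Everything else in the sentence is identical.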

The yaml file for gstreamer is as follows.

use-nnet2: True
decoder:
    # All the properties nested here correspond to the kaldinnet2onlinedecoder GStreamer plugin properties.
    # Use gst-inspect-1.0 ./libgstkaldionline2.so kaldinnet2onlinedecoder to discover the available properties
    nnet-mode: 3
    use-threaded-decoder: true
    model : /opt/models/test/models/korean/zeroth/final.mdl
    word-syms : /opt/models/test/models/korean/zeroth/words.txt
    fst : /opt/models/test/models/korean/zeroth/HCLG.fst
    mfcc-config : /opt/models/test/models/korean/zeroth/conf/mfcc.conf
    ivector-extraction-config : /opt/models/test/models/korean/zeroth/conf/ivector_extractor.conf
    max-active: 7000
    min-active: 200
    beam: 15.0
    lattice-beam: 6.0
    acoustic-scale: 1.0
    do-endpointing : false
    endpoint-silence-phones : "1:2:3:4:5:6:7:8:9:10"
    traceback-period-in-secs: 0.25
    chunk-length-in-secs: 0.25
    num-nbest: 1
    #Additional functionality that you can play with:
    lm-fst: /opt/models/test/models/korean/zeroth/G.fst
    big-lm-const-arpa: /opt/models/test/models/korean/zeroth/G.carpa
    phone-syms: /opt/models/test/models/korean/zeroth/phones.txt
    word-boundary-file: /opt/models/test/models/korean/zeroth/word_boundary.int
    do-phone-alignment: true

out-dir: tmp

use-vad: False

silence-timeout: 10

Could you give me a hint as to whether this can be configured in the yaml, or whether I need to modify gst-kaldi-nnet2-online? Thank you.
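As a rough check of the yaml side of this question, the sketch below tests whether the two LM-related decoder properties from the posted config are both present. It assumes, based on my reading of the gst-kaldi-nnet2-online README rather than anything verified in this thread, that the plugin only rescores final lattices when both `lm-fst` and `big-lm-const-arpa` are set. The function name `missing_rescoring_properties` is hypothetical.

```python
# Property names are taken from the yaml in the post. The rule that BOTH must be
# set for rescoring is an assumption based on the plugin README, not verified here.
RESCORING_PROPERTIES = ("lm-fst", "big-lm-const-arpa")


def missing_rescoring_properties(decoder_conf):
    """Return the rescoring-related properties that are absent or empty."""
    return [name for name in RESCORING_PROPERTIES if not decoder_conf.get(name)]


# Subset of the `decoder:` block from the post, written out as a plain dict.
decoder_conf = {
    "fst": "/opt/models/test/models/korean/zeroth/HCLG.fst",
    "lm-fst": "/opt/models/test/models/korean/zeroth/G.fst",
    "big-lm-const-arpa": "/opt/models/test/models/korean/zeroth/G.carpa",
}

missing = missing_rescoring_properties(decoder_conf)
print("missing rescoring properties:", missing)
```

For the posted config the list is empty, so under that assumption the yaml already contains the properties that matter. It may still be worth confirming that G.fst is the same LM that HCLG.fst was compiled from and that G.carpa was built from the larger LM used by decode.sh, since lattice rescoring in Kaldi subtracts the old LM scores before adding the new ones.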

SH Lee

Oct 25, 2021, 2:31:22 AM

to zeroth-help

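As far as I know, the decoder used by gstreamer-kaldi is different from the one used by zeroth's decode.sh.

I think this is because only a limited set of decoders can be used from gstreamer.

I suspect the difference in output comes from that rather than from a rescore problem.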

On Thursday, October 21, 2021 at 11:03:58 AM UTC+9, shev...@gmail.com wrote:
