Need help on GOP

Amol Bole

Jun 6, 2022, 7:37:01 AM

to kaldi-help

Dear All,

We are working on GOP (Goodness of Pronunciation) and got the following output.

35363738 [ 1 -0.04500699 ] [ 4 -0.736316 ] [ 22 -0.6936717 ] [ 33 -0.5055847 ] [ 14 -0.5047328 ] [ 1 -4.575368 ] [ 5 -0.0970664 ] [ 11 -0.8301041 ] [ 19 -1.232405 ] [ 32 -2.149931 ] [ 5 0 ] [ 25 -0.8108375 ] [ 1 -1.426894 ] [ 4 -0.9305243 ] [ 16 -1.102926 ] [ 33 -0.6984282 ] [ 14 -0.7835178 ] [ 1 0 ]

In the output, [ 4 -0.736316 ] means the pure-phone index is 4 and its GOP is -0.736316. From this, how can we tell whether a phone was pronounced correctly or incorrectly? Also, how can we convert this GOP value to an accuracy? For reference, I am attaching our GOP script. Please correct me if I am wrong.

run_modified.sh

gop_output.txt

Aman Deep

Mar 31, 2023, 5:37:41 AM

to kaldi-help

How did you produce the .txt file, i.e. gop.txt? I did not get a .txt file after running run.sh.

李俊廷（ヤンヤン）

May 10, 2024, 10:01:25 PM

to kaldi-help

Hi, Amol Bole

For your first question: how do we tell a correct phone from a wrong one?

Looking at the script you provided, you removed the compile-train-graphs-without-lexicon stage and use align.sh instead, so compile-train-graphs is used. That step uses the lexicon to find the best-path phoneme sequence. However, the speechocean762 corpus provides its own text-to-phone mapping for each word in each utterance, so that the alignment matches the phoneme-level score targets.

To the best of my knowledge, the GOP score alone cannot tell us whether a specific phone was pronounced correctly or incorrectly. That is the problem studied in mispronunciation detection and diagnosis. One example: https://ieeexplore.ieee.org/abstract/document/10097226/

On Monday, June 6, 2022 at 7:37:01 PM [UTC+8], Amol Bole wrote:

Karel Veselý

May 13, 2024, 10:15:47 AM

to kaldi-help

Hi,
it seems that the format is [ <phone_index> <log_posterior> ].
If the scores were well calibrated, the default threshold p=0.5 corresponds to log-value -0.693 (higher is correct, lower is incorrect).
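
As a rough illustration of the reading above, here is a minimal Python sketch that parses the line from the original post and applies the log(0.5) threshold. It assumes the format really is [ <phone_index> <log_posterior> ] and that the scores are well calibrated, which may not hold in practice. The function names parse_gop_line and flag_phones are made up for this example and are not part of Kaldi.

```python
import math
import re

# Line copied from the original post: utterance id, then [ phone_index score ] pairs.
line = ("35363738 [ 1 -0.04500699 ] [ 4 -0.736316 ] [ 22 -0.6936717 ] "
        "[ 33 -0.5055847 ] [ 14 -0.5047328 ] [ 1 -4.575368 ] [ 5 -0.0970664 ] "
        "[ 11 -0.8301041 ] [ 19 -1.232405 ] [ 32 -2.149931 ] [ 5 0 ] "
        "[ 25 -0.8108375 ] [ 1 -1.426894 ] [ 4 -0.9305243 ] [ 16 -1.102926 ] "
        "[ 33 -0.6984282 ] [ 14 -0.7835178 ] [ 1 0 ]")

def parse_gop_line(text):
    """Split one text-format line into (utterance_id, [(phone_index, score), ...])."""
    utt_id, rest = text.split(maxsplit=1)
    found = re.findall(r"\[\s*(\d+)\s+(\S+)\s*\]", rest)
    return utt_id, [(int(phone), float(score)) for phone, score in found]

# Assumption: scores are calibrated log posteriors, so p = 0.5 maps to log(0.5).
THRESHOLD = math.log(0.5)  # about -0.693

def flag_phones(pairs, threshold=THRESHOLD):
    """Return (phone_index, score, probability, looks_correct) for each phone."""
    return [(p, s, math.exp(s), s >= threshold) for p, s in pairs]

utt_id, pairs = parse_gop_line(line)
flags = flag_phones(pairs)
for phone, score, prob, ok in flags:
    print(f"phone {phone:2d}  score {score:9.4f}  p {prob:.3f}  "
          f"{'correct' if ok else 'suspect'}")
```

Note that math.exp(score) turns a log posterior back into a probability, which is the closest thing to a per-phone "accuracy" that this output offers under the stated assumptions.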

In practice, you may need to calibrate these scores further by passing them through a logistic regression trained on labelled in-domain data.
Or, at the very least, you can plot histograms of the scores for correct and incorrect phones and set the threshold accordingly.
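
To make the calibration idea concrete, here is a small pure-Python sketch of a one-feature logistic regression fitted by gradient descent. The labelled data below is invented purely for illustration (it is not from speechocean762 or from the thread), and the names fit_logistic and calibrated_prob are hypothetical. With real data you would more likely use an existing library implementation.

```python
import math

# HYPOTHETICAL toy data: (GOP score, 1 if a human rated the phone correct, else 0).
labelled = [
    (0.00, 1), (-0.05, 1), (-0.10, 1), (-0.50, 1), (-0.70, 1), (-0.80, 1),
    (-0.60, 0), (-0.90, 0), (-1.20, 0), (-1.40, 0), (-2.10, 0), (-4.60, 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(data, lr=0.1, steps=5000):
    """Fit p(correct | score) = sigmoid(a * score + b) by batch gradient descent."""
    a, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in data:
            err = sigmoid(a * x + b) - y  # gradient of the log loss w.r.t. the logit
            grad_a += err * x
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

a, b = fit_logistic(labelled)

def calibrated_prob(score):
    """Calibrated probability that a phone with this GOP score is correct."""
    return sigmoid(a * score + b)

# GOP score at which the calibrated probability equals 0.5.
decision_score = -b / a

print(f"a={a:.3f} b={b:.3f} decision threshold on raw GOP score={decision_score:.3f}")
```

The fitted decision_score then replaces the default -0.693 threshold. It depends entirely on the labelled data, so the value from this toy set carries no meaning for a real system.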

Best,
Karel

On Saturday, May 11, 2024 at 4:01:25 UTC+2, 李俊廷（ヤンヤン） wrote: